Abstract
We introduce a new characterization of the Cauchy distribution and propose a class of goodness-of-fit tests for the Cauchy family. The limit distribution is derived in a Hilbert space framework under the null hypothesis. The new tests are consistent against a large class of alternatives. A comparative Monte Carlo simulation study shows that the new tests are good competitors to state-of-the-art procedures, and we apply them to log-returns of cryptocurrencies.
1 Introduction
In this article we address the question of whether a univariate real data set stems from the famous family of Cauchy distributions. The Cauchy distribution is undoubtedly the standard example of a distribution without an existing mean value; it has been studied for more than 300 years and has wide applicability in diverse fields, ranging from modeling resonances in physics (where it is often called the Lorentz distribution) to cryptocurrencies in finance, see Szczygielski et al. (2020). It is also known as the Breit–Wigner distribution; for an extensive historical overview, see Stigler (1974). To be precise, we write shorthand C\((\alpha ,\beta )\), \(\alpha \in \mathbb {R}\), \(\beta > 0\), for the Cauchy distribution with location parameter \(\alpha \) and scale parameter \(\beta \), having density
$$\begin{aligned} f(x,\alpha ,\beta )=\frac{1}{\pi }\,\frac{\beta }{\beta ^2+(x-\alpha )^2},\quad x\in \mathbb {R}. \end{aligned}$$
For a detailed discussion of this family, see Johnson et al. (1994), Chapter 16. The Cauchy distribution is a so-called heavy-tailed distribution and a member of the family of stable distributions, see Nolan (2020). Note that \(X \sim \) C\((\alpha , \beta )\) if, and only if, \((X-\alpha )/\beta \sim \) C(0, 1), and hence the Cauchy distribution belongs to the location-scale family of distributions. In the following we denote the family of Cauchy distributions by \(\mathcal {C}:=\{\text{ C }(\alpha ,\beta ):\,\alpha \in \mathbb {R},\beta >0\}\), a family of distributions which is closed under translation and rescaling. We test the composite hypothesis
$$\begin{aligned} H_0:\,\mathbb {P}^X\in \mathcal {C} \end{aligned}$$(1)
against general alternatives on the basis of independent and identically distributed copies \(X_1,\ldots ,X_n\) of X defined on an underlying probability space \((\Omega ,\mathcal {A},\mathbb {P})\). This testing problem has been considered in the literature: Gürtler and Henze (2000) propose a test procedure based on the empirical characteristic function, and Matsui and Takemura (2005) extend this test by considering alternative estimation methods. More recently, Mahdizadeh and Zamanzade (2017) propose to use the likelihood ratio as in Zhang (2002) as well as the Kullback–Leibler (KL) distance, an idea that is extended in Mahdizadeh and Zamanzade (2019). A quantile based method is proposed in Rublik (2001) and compared to the classical omnibus procedures. An empirical power study of goodness-of-fit tests for the Cauchy model based on the empirical distribution function, such as the Kolmogorov–Smirnov test, the Cramér–von Mises test, the Kuiper test, the Anderson–Darling test and the Watson test, is found in Onen et al. (2001). In Litvinova (2005), two tests for the standard Cauchy distribution based on characterizations are given; however, they are not designed for the composite hypothesis (1).
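As an illustration of the location-scale structure (a sketch, not from the paper), standard Cauchy variates can be generated via the probability integral transform, and standardizing a C\((\alpha ,\beta )\) sample with the true parameters recovers C(0, 1):

```python
import numpy as np

rng = np.random.default_rng(0)

# If U ~ Uniform(-1/2, 1/2), then tan(pi * U) ~ C(0, 1).
u = rng.uniform(-0.5, 0.5, size=200_000)
x = np.tan(np.pi * u)

# Location-scale property: alpha + beta * X ~ C(alpha, beta), so
# standardizing with the true parameters recovers C(0, 1),
# whose quartiles are -1, 0 and 1.
alpha, beta = 2.0, 3.0
y = ((alpha + beta * x) - alpha) / beta
q1, med, q3 = np.quantile(y, [0.25, 0.5, 0.75])
```

The empirical quartiles of `y` approach \(-1, 0, 1\), the quartiles of C(0, 1).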
The novel procedure is based on the following new characterization of the standard Cauchy distribution.
Theorem 1.1
Let X be a random variable with absolutely continuous density p and \(\mathbb {E}\left[ \frac{|X|}{1+X^2}\right] < \infty \). Then X has a Cauchy distribution C(0, 1) if and only if
$$\begin{aligned} \mathbb {E}\left[ \left( it-\frac{2X}{1+X^2}\right) e^{itX}\right] =0 \end{aligned}$$(2)
holds for all \(t\in \mathbb {R}\), where i denotes the imaginary unit.
Proof
For \(X\sim \text{ C }(0,1)\), a direct calculation shows the assertion. Conversely, let X be a random variable with absolutely continuous density function p(x) such that (2)
holds for all \(t\in \mathbb {R}\). Note that since \(-it\mathbb {E}[\exp (itX)]\) is the Fourier–Stieltjes transform of the derivative \(p'(x)\) of p(x), we have
$$\begin{aligned} \int _{-\infty }^\infty p'(x)e^{itx}\,\text{ d }x=\int _{-\infty }^\infty \left( -\frac{2x}{1+x^2}\right) p(x)e^{itx}\,\text{ d }x \end{aligned}$$
for all \(t\in \mathbb {R}\). By properties of the Fourier–Stieltjes transform, we hence note that p(x) must satisfy the ordinary differential equation
$$\begin{aligned} p'(x)=-\frac{2x}{1+x^2}\,p(x) \end{aligned}$$
for almost all \(x\in \mathbb {R}\). The only solution satisfying \(\int _{-\infty }^\infty p(x)\text{ d }x=1\) is \(p(x)=f(x,0,1)\), \(x\in \mathbb {R}\), and \(X\sim \text{ C }(0,1)\) follows. \(\square \)
Remark 1.2
The characterization in Theorem 1.1 is related to the spectral representation of the Stein operator of the so-called density approach pioneered in Stein et al. (2004). Note that the quotient in (2) is \(f'(x,0,1)/f(x,0,1)=-2x/(1+x^2)\), \(x\in \mathbb {R}\), and the set of so-called test functions is given by \(\{\exp (itx):\, t\in \mathbb {R}\}\). For details on this approach see Anastasiou et al. (2022), Sect. 5.4.2.
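The quoted score quotient is easy to sanity-check numerically (an illustrative sketch only, comparing a finite-difference derivative of the C(0, 1) density with the closed form):

```python
import numpy as np

# f(x) = 1 / (pi (1 + x^2)) is the C(0,1) density; its score f'/f
# should equal -2x / (1 + x^2), the quotient quoted above.
x = np.linspace(-10.0, 10.0, 2001)
f = 1.0 / (np.pi * (1.0 + x**2))
score_numeric = np.gradient(f, x) / f        # finite-difference check
score_closed = -2.0 * x / (1.0 + x**2)
err = float(np.max(np.abs(score_numeric - score_closed)))
```

The maximal deviation `err` is of the order of the finite-difference error.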
This paper is organized as follows. In Sect. 2, we introduce a family of test statistics, denoted by \(T_{n,a}\), which is based on the characterization in Theorem 1.1. In Sect. 3, the limit distribution of \(T_{n,a}\) is derived in a Hilbert space framework under the null hypothesis. Furthermore, we derive the limit distribution of the statistic \({\widetilde{T}}_{n,0}=\lim _{a\rightarrow 0} aT_{n,a}\). Consistency of the new tests against a large class of alternatives is shown in Sect. 4. An extensive Monte Carlo simulation study in Sect. 5 shows that the tests are good competitors to state-of-the-art procedures. In Sect. 6, the tests are applied to log-returns of cryptocurrencies. Finally, conclusions are given in Sect. 7.
2 A new class of goodness-of-fit tests for the Cauchy distribution
The testing problem under discussion is invariant with respect to transformations of the kind \(x\rightarrow ax+b, \, x\in \mathbb {R}\), where \(a>0\) and \(b\in \mathbb {R}\). Consequently, a decision in favor of or against \(H_0\) should be the same for \(X_1,\ldots ,X_n\) and \(aX_1+b,\ldots ,aX_n+b\). This goal is achieved if the test statistic \(T_n\), say, is based on the standardized data \(Y_{n,1},\ldots ,Y_{n,n}\), given by
$$\begin{aligned} Y_{n,j}=\frac{X_j-\widehat{\alpha }_n}{\widehat{\beta }_n},\quad j=1,\ldots ,n. \end{aligned}$$(3)
Here, \(\widehat{\alpha }_n=\widehat{\alpha }_n(X_1,\ldots ,X_n)\) and \(\widehat{\beta }_n=\widehat{\beta }_n(X_1,\ldots ,X_n)\) denote consistent estimators of \(\alpha \in \mathbb {R}\) and \(\beta >0\) such that
$$\begin{aligned} \widehat{\alpha }_n(aX_1+b,\ldots ,aX_n+b)=a\,\widehat{\alpha }_n(X_1,\ldots ,X_n)+b \end{aligned}$$(4)
and
$$\begin{aligned} \widehat{\beta }_n(aX_1+b,\ldots ,aX_n+b)=a\,\widehat{\beta }_n(X_1,\ldots ,X_n) \end{aligned}$$(5)
holds for each \(a>0\) and \(b\in \mathbb {R}\). By (4) and (5) it is easy to see that \(Y_{n,j}\), \(j=1,\ldots ,n,\) depend neither on the location nor on the scale parameter. Hence, \(T_n\) has the property \(T_n(aX_1+b,\ldots ,aX_n+b)=T_n(X_1,\ldots ,X_n)\), and we may and do assume \(\alpha =0\) and \(\beta =1\) in the following. Motivated by Theorem 1.1, we choose the test statistic
$$\begin{aligned} T_n=T_{n,a}=\frac{1}{n}\int _{-\infty }^\infty \left| \sum _{j=1}^n\left( it-\frac{2Y_{n,j}}{1+Y_{n,j}^2}\right) e^{itY_{n,j}}\right| ^2\,\omega (t)\,\text{ d }t, \end{aligned}$$
which is the weighted \(L^2\)-distance from (2) to 0. Here, \(|\cdot |\) is the complex absolute value and \(\omega :\mathbb {R}\rightarrow (0,\infty )\) denotes a positive weight function that is given by \(\omega (t)=\omega _a(t)=\exp (-a|t|)\), \(t\in \mathbb {R}\). Note that \(\omega \) is symmetric around the origin, i.e. \(\omega (t)=\omega (-t)\) holds for all \(t\in \mathbb {R}\), and that
holds. We have the integration-free, numerical stable formula
and hence a whole family of tests depending on the so called tuning parameter \(a>0\). The next result deals with the limit behavior of \(T_{n,a}\) for \(a\rightarrow 0\) and \(a\rightarrow \infty \).
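The statistic can be evaluated directly as a double sum over the standardized sample. The sketch below is our reconstruction from the description of \(T_{n,a}\) as a weighted \(L^2\)-distance based on the empirical counterpart of the characterization in Theorem 1.1, and should be checked against the paper's display (8); the median/half-IQR standardization follows Remark 3.2:

```python
import numpy as np

def t_na(y, a=1.0):
    # Double-sum evaluation of T_{n,a} (our reconstruction of the
    # integration-free formula; check against display (8) in the text).
    y = np.asarray(y, dtype=float)
    n = y.size
    s = 2.0 * y / (1.0 + y**2)           # the quotient -f'/f from Remark 1.2
    d = y[:, None] - y[None, :]          # pairwise differences Y_j - Y_k
    q = a * a + d * d
    total = (4.0 * a * (a * a - 3.0 * d * d) / q**3
             - 4.0 * a * d * (s[:, None] - s[None, :]) / q**2
             + 2.0 * a * np.outer(s, s) / q)
    return float(total.sum()) / n

def standardize(x):
    # median / half-IQR standardization, cf. Remark 3.2
    q1, med, q3 = np.quantile(x, [0.25, 0.5, 0.75])
    return (x - med) / (0.5 * (q3 - q1))

rng = np.random.default_rng(1)
t_cauchy = t_na(standardize(rng.standard_cauchy(1500)))  # stays O(1) under the null
t_normal = t_na(standardize(rng.normal(size=1500)))      # grows with n under alternatives
```

Under the null the statistic remains of constant order, while under a fixed alternative it grows linearly in n, which is the basis of the consistency result in Sect. 4.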
Theorem 2.1
For fixed n, we have
$$\begin{aligned} \lim _{a\rightarrow 0}\left( aT_{n,a}-\frac{4}{a^2}\right) =\frac{8}{n}\sum _{j=1}^n\frac{Y_{n,j}^2}{\big (1+Y_{n,j}^2\big )^2} \end{aligned}$$(9)
and
$$\begin{aligned} \lim _{a\rightarrow \infty }aT_{n,a}=\frac{2}{n}\left( \sum _{j=1}^n\frac{2Y_{n,j}}{1+Y_{n,j}^2}\right) ^2. \end{aligned}$$(10)
Proof
Splitting the sum in (8) into a diagonal and a non-diagonal part results in
$$\begin{aligned} T_{n,a}=T_{n,a}^{d}+T_{n,a}^{nd}, \end{aligned}$$
say. Since
$$\begin{aligned} T_{n,a}^{d}=\frac{1}{n}\sum _{j=1}^n\int _{-\infty }^\infty \left( t^2+\frac{4Y_{n,j}^2}{\big (1+Y_{n,j}^2\big )^2}\right) e^{-a|t|}\,\text{ d }t=\frac{4}{a^3}+\frac{8}{an}\sum _{j=1}^n\frac{Y_{n,j}^2}{\big (1+Y_{n,j}^2\big )^2} \end{aligned}$$
and \(\lim _{a\rightarrow 0}aT_{n,a}^{nd}=0\), we obtain
$$\begin{aligned} \lim _{a\rightarrow 0}\left( aT_{n,a}-\frac{4}{a^2}\right) =\frac{8}{n}\sum _{j=1}^n\frac{Y_{n,j}^2}{\big (1+Y_{n,j}^2\big )^2}. \end{aligned}$$
By the symmetry of the weight function \(\omega _a(\cdot )\), straightforward calculations show
$$\begin{aligned} T_{n,a}=\int _{-\infty }^\infty Z_n^2(t)\,\omega _a(t)\,\text{ d }t, \end{aligned}$$
where
$$\begin{aligned} Z_n(t)=\frac{1}{\sqrt{n}}\left| \sum _{j=1}^n\left( it-\frac{2Y_{n,j}}{1+Y_{n,j}^2}\right) e^{itY_{n,j}}\right| ,\quad t\in \mathbb {R}. \end{aligned}$$(11)
Then, \(aT_{n,a}=\int _{-\infty }^\infty Z_n^2(s/a)\,\exp (-|s|)\text{ d }s\), where \(Z_n\) satisfies \(Z_n^2(s/a)\le 4n(1+|s|)^2\), for \(a>1\), and
$$\begin{aligned} \lim _{a\rightarrow \infty }Z_n^2(s/a)=Z_n^2(0)=\frac{1}{n}\left( \sum _{j=1}^n\frac{2Y_{n,j}}{1+Y_{n,j}^2}\right) ^2,\quad s\in \mathbb {R}. \end{aligned}$$
Now, the second assertion follows from an application of the dominated convergence theorem. \(\square \)
3 Limit distribution under the null hypothesis
The asymptotic theory is derived in the Hilbert space \(\mathbb {H}\) of measurable, square integrable functions \(\mathbb {H}=L^2(\mathbb {R},\mathcal {B}, \omega (t)\text{ d }t)\), where \(\mathcal {B}\) is the Borel-\(\sigma \)-field of \(\mathbb {R}\) and \(\omega (t)\) is the weight function defined in the introduction. Notice that \(T_n=\int _{-\infty }^\infty Z_n^2(t)\,\omega (t)\text{ d }t,\) where \(Z_n\) is defined in (11), is a real-valued \((\mathcal {A} \otimes \mathcal {B}, \mathcal {B})\)-measurable random element of \(\mathbb {H}\). We denote by
the usual norm and inner product in \(\mathbb {H}\). In the following, we assume that the estimators \(\widehat{\alpha }_n\) and \(\widehat{\beta }_n\) allow linear representations
where \(X_1,\ldots ,X_n \sim C(0,1)\) are independent random variables, \(o_\mathbb {P}(1)\) denotes a term that converges to 0 in probability, and \(\psi _1\) and \(\psi _2\) are measurable functions with
here, C is a positive constant, \(^\top \) stands for the transpose of a vector and \(I_2\) is the \(2\times 2\)-identity matrix. See Remark 3.2 for examples of estimation procedures satisfying these assumptions.
Theorem 3.1
Let \(X_1,\ldots ,X_n\) be i.i.d. \(\text{ C }(\alpha ,\beta )\) distributed random variables. Then there exists a centred Gaussian random process \(\mathcal {Z}\) in \(\mathbb {H}\) with covariance kernel
for \(s,t\in \mathbb {R},\) such that \(T_n\overset{D}{\longrightarrow }\Vert \mathcal {Z}\Vert _\mathbb {H}^2\) as \(n\rightarrow \infty \).
A proof of Theorem 3.1 is found in Appendix A.1. It is well known that the distribution of \(\Vert \mathcal {Z}\Vert _{\mathbb {H}}^2\) is that of \(\sum _{j=1}^\infty \lambda _jN_j^2\), where \(N_1,N_2,\ldots \) are i.i.d. standard normal random variables and \((\lambda _j)\) is a decreasing sequence of positive eigenvalues of the integral operator
$$\begin{aligned} \mathcal {K}g(s)=\int _{-\infty }^\infty K(s,t)g(t)\,\omega (t)\,\text{ d }t,\quad s\in \mathbb {R}. \end{aligned}$$
Due to the complexity of the covariance kernel K as given in Theorem 3.1 it seems hopeless to solve the integral equation \(\mathcal {K}g(s)=\lambda g(s)\) and find explicit values of \(\lambda _j\), \(j\ge 1\). For a numerical approximation method we refer to Subsect. 3.3 in Matsui and Takemura (2005). Note that
$$\begin{aligned} \mathbb {E}\Vert \mathcal {Z}\Vert _{\mathbb {H}}^2=\int _{-\infty }^\infty K(t,t)\,\omega _a(t)\,\text{ d }t \end{aligned}$$
and
$$\begin{aligned} \text{ Var }\Vert \mathcal {Z}\Vert _{\mathbb {H}}^2=2\int _{-\infty }^\infty \int _{-\infty }^\infty K^2(s,t)\,\omega _a(s)\omega _a(t)\,\text{ d }t\,\text{ d }s \end{aligned}$$
can be derived for specific estimation procedures. Such results in the theory of goodness-of-fit tests for the Cauchy family are scarce; for some explicit formulae for mean values, see Gürtler and Henze (2000) and Matsui and Takemura (2005).
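Although the eigenvalues \(\lambda _j\) of \(\mathcal {K}\) are not available in closed form, the law of \(\sum _{j=1}^\infty \lambda _jN_j^2\) is easy to approximate by Monte Carlo once a (truncated) eigenvalue sequence is given. The sketch below uses a purely hypothetical, summable sequence \(\lambda _j=j^{-2}\) for illustration:

```python
import numpy as np

# Quantiles of sum_j lambda_j N_j^2 by Monte Carlo; the eigenvalues
# below are purely hypothetical (closed forms are not available).
rng = np.random.default_rng(42)
lam = 1.0 / np.arange(1, 51, dtype=float) ** 2   # hypothetical, summable
reps = 100_000
samples = (lam * rng.standard_normal((reps, lam.size)) ** 2).sum(axis=1)
mean = float(samples.mean())                     # should be close to sum(lam)
q95 = float(np.quantile(samples, 0.95))          # approximate 95% quantile
```

The sample mean matches \(\sum _j\lambda _j\), since each \(N_j^2\) has expectation 1; the same device yields approximate critical values once the \(\lambda _j\) are estimated numerically.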
Remark 3.2
The generality of Theorem 3.1 in view of the linear representations of the estimators, and hence the dependence on the functions \(\psi _1\) and \(\psi _2\), leads to explicit covariance kernels for different parameter estimation procedures. To estimate \(\alpha \) and \(\beta \) we choose the following location and scale estimators \(\widehat{\alpha }_n\) and \(\widehat{\beta }_n\), which all satisfy (4) and (5), respectively. For a compact notation, we write \(\psi (x)=(\psi _1(x),\psi _2(x))^\top \). Some derivations were partly carried out with the computer algebra system Maple, see Maplesoft (2019).
1.
Median and interquartile-distance estimators: Let \(\xi _p\), \(p\in (0,1)\), denote the p-quantile of the underlying distribution F, \(\widehat{\xi }_{p,n}\) the sample p-quantile, and \(X_{(1)}\le \cdots \le X_{(n)}\) the order statistics of \(X_1,\ldots ,X_n\). With \(\lfloor \cdot \rfloor \) denoting the floor function, let
$$\begin{aligned} \widehat{\alpha }_n={\left\{ \begin{array}{ll} \frac{1}{2}(X_{(\frac{n}{2})}+X_{(\frac{n}{2}+1)}), &{} \text {if}\,n\,\text{ even, }\\ X_{(\lfloor \frac{n}{2}\rfloor +1)}, &{} \text {otherwise,} \end{array}\right. } \end{aligned}$$(16)be the unbiased empirical median and
$$\begin{aligned} \widehat{\beta }_n=\frac{1}{2}(\widehat{\xi }_{\frac{3}{4},n}-\widehat{\xi }_{\frac{1}{4},n}) \end{aligned}$$(17)the half-interquartile range (iqr) of the sample. Under mild regularity conditions \(\widehat{\alpha }_n\) and \(\widehat{\beta }_n\) are consistent estimators of \(\alpha \) and \(\beta \). Display (3.3) of Gürtler and Henze (2000) then gives the so-called Bahadur representations (see Theorem 2.5.1 in Serfling (1980)) with
$$\begin{aligned} \psi _1(x)=\pi \left( \frac{1}{2}-{\textbf{1}}\lbrace x\le 0\rbrace \right) ,\quad \text{ and }\quad \psi _2(x)=\pi \left( \frac{1}{2}-{\textbf{1}}\lbrace -1\le x\le 1\rbrace \right) ,\ x\in \mathbb {R}. \end{aligned}$$It is easy to see that \(\mathbb {E}[\psi _1(X_1)]=\mathbb {E}[\psi _2(X_1)]=0\) and \(\mathbb {E}[\psi (X_1)\psi (X_1)^\top ]=\frac{\pi ^2}{4}I_{2}\) holds. With these representations, we get the covariance kernel of \(\mathcal {Z}\) in Theorem 3.1
$$\begin{aligned} K_{MIQ}(s,t)&=\frac{1}{2}\big (s^2+t^2+\vert s-t\vert +1\big )e^{-\vert s-t\vert }\\&\quad +\Big [t\big (\vert t\vert +1\big )\big (2J_1(s)-sJ_2(s)\big )- \big (t^2+\vert t\vert +1\big )\big (\frac{s}{2}J_3(s)+J_4(s)\big )\Big ]e^{-\vert t\vert }\\&\quad +\Big [s\big (\vert s\vert +1\big )\big (2J_1(t)-tJ_2(t)\big )-\big (s^2+\vert s\vert +1\big ) \big (\frac{t}{2}J_3(t)+J_4(t)\big )\Big ]e^{-\vert s\vert }\\&\quad +\frac{\pi ^2}{16}\Big [\big (s^2+\vert s\vert +1\big )\big (t^2+\vert t\vert +1\big )+st\big (\vert s\vert +1\big )\big (\vert t\vert +1\big )\Big ]e^{-\vert s\vert -\vert t\vert }, \end{aligned}$$for \(s,t\in \mathbb {R},\) where
$$\begin{aligned} J_1(t)&=\int _0^1\frac{x\sin (tx)}{(1+x^2)^2}dx, J_2(t)=\int _0^1\frac{\cos (tx)}{1+x^2}dx, \\ J_3(t)&=\int _0^\infty \frac{\sin (tx)}{1+x^2}dx, J_4(t)=\int _0^\infty \frac{x\cos (tx)}{(1+x^2)^2}dx. \end{aligned}$$Direct calculations of integrals lead to
$$\begin{aligned} \mathbb {E}\Vert \mathcal {Z}\Vert _{\mathbb {H}}^2= & {} \int _{-\infty }^\infty K_{MIQ}(t,t)\omega _a(t)\,\text{ d }t \\= & {} (8 \left( a+2 \right) ^{5} \left( 1+a \right) ^{3}{a}^{3} \left( {a}^{2}+2\,a+2 \right) ^{3})^{-1}\left[ \left( {\pi }^{2}-8 \right) {a}^{16} + \left( 19\,{\pi }^{2}-152 \right) {a}^{15} \right. \\{} & {} + \left( 173\,{\pi }^{2}-1368 \right) {a}^{14}+ \left( 1003\,{\pi }^{2}-7720 \right) {a}^{13} + \left( 4126\,{\pi }^{2}-30192 \right) {a}^{12}\\{} & {} + \left( 12594\,{\pi }^{2}-84304 \right) {a}^{11}+ \left( 29128\,{\pi }^{2}-163520 \right) {a}^{10} \\{} & {} + \left( 51460\,{\pi }^{2}-188832 \right) {a}^{9}+ \left( 69320\,{\pi }^{2}-8256 \right) {a}^{8}\\{} & {} + \left( 70296\,{\pi }^{2}+457664 \right) {a}^{7}+ \left( 52176\,{\pi }^{2}+1025920 \right) {a}^{6}\\{} & {} + \left( 26848\,{\pi }^{2}+1323264 \right) {a}^{5}+ \left( 8576\,{\pi }^{2}+1151488 \right) {a}^{4} \\{} & {} \left. + \left( 1280\,{\pi }^{2}+693248 \right) {a}^{3}+280576\,{a}^{2}+69632\,a+8192\right] . \end{aligned}$$ -
2.
Maximum likelihood estimators: Matsui and Takemura (2005) show in Lemma A.1 that for the maximum likelihood estimators \(\widehat{\alpha }_n\) and \(\widehat{\beta }_n\) in the Cauchy family the linear representations are given by
$$\begin{aligned} \psi _1(x)=\frac{4x}{1+x^2},\quad \text{ and }\quad \psi _2(x)=\frac{2(x^2-1)}{1+x^2},\;x\in \mathbb {R}. \end{aligned}$$Again, straightforward calculations show \(\mathbb {E}[\psi _1(X_1)]=\mathbb {E}[\psi _2(X_1)]=0\) as well as \(\mathbb {E}[\psi (X_1)\psi (X_1)^\top ]=2 I_{2}\). Note that there are no closed form expressions for the estimators, such that the log-likelihood equations have to be solved numerically. This leads to the covariance kernel of \(\mathcal {Z}\) in Theorem 3.1 given by
$$\begin{aligned} K_{ML}(s,t)&= \frac{1}{2}\big (s^2+t^2+\vert s-t\vert +1\big )e^{-\vert s-t\vert } \\&\quad -\frac{1}{2}(t^2+\vert t\vert +1)(s^2+\vert s\vert +1)e^{-\vert t\vert -\vert s\vert } \\&\quad -\frac{1}{2}t(\vert t\vert +1)s(\vert s\vert +1)e^{-\vert t\vert -\vert s\vert },\quad s,t\in \mathbb {R}. \end{aligned}$$With this explicit formula for the covariance kernel, we compute
$$\begin{aligned} \mathbb {E}\Vert \mathcal {Z}\Vert _{\mathbb {H}}^2=\int _{-\infty }^\infty K_{ML}(t,t)\omega _a(t)\,\text{ d }t={\frac{8\,{a}^{4}+80\,{a}^{3}+352\,{a}^{2}+320\,a+128}{{a}^{3} \left( a+2 \right) ^{5}}} \end{aligned}$$and
$$\begin{aligned} \text{ Var }\Vert \mathcal {Z}\Vert _{\mathbb {H}}^2= & {} 2\int _{-\infty }^\infty \int _{-\infty }^ \infty K_{ML}^2(s,t)\omega _a(s)\omega _a(t)\,\text{ d }t\text{ d }s\\{} & {} = \left( \frac{1}{2}{a}^{5} \left( a+1 \right) ^{7} \left( a+2 \right) ^{10}\right) ^{-1} \left[ 10\,{a}^{14}+270\,{a}^{13}+3521\,{a}^{12}\right. \\{} & {} +27987\,{a}^{11}+146819\,{a}^{10}+510582\,{a}^{9}+1194078\,{a}^{8}+1914216\,{a}^{7}\\{} & {} +2134432\,{a}^{6}+1671456\,{a}^{5}+928192\,{a}^{4}+369792\,{a}^{3}+104192\,{a}^{2}\\{} & {} \left. +18432\,a+1536\right] . \end{aligned}$$ -
3.
Equivariant integrated squared error estimator: In Matsui and Takemura (2005) the authors propose an equivariant version of the minimal integrated squared error estimator introduced by Besbeas and Morgan (2001) for the Cauchy distribution. The estimators are derived by minimization of the weighted \(L^2\)-distance
$$\begin{aligned} I(\alpha ,\beta )=\int _{-\infty }^{\infty }\left| \varphi _n(t;\alpha ,\beta )-e^{-\vert t\vert }\right| ^2\omega (t)dt, \end{aligned}$$where \(\varphi _n(t;\alpha ,\beta )=\frac{1}{n}\sum _{j=1}^n \exp \left( it(X_j-\alpha )/\beta \right) \), \(t\in \mathbb {R}\), is the empirical characteristic function of a distribution from the location scale family in dependence of the parameters. As weight function the authors chose the same weight function \(\omega _\nu (t)=\exp (-\nu |t|)\), \(t\in \mathbb {R}\), \(\nu >0\), and hence get families of estimators \(\widehat{\alpha }_{n,\nu }\) and \(\widehat{\beta }_{n,\nu }\) in dependence of \(\nu \). Note that the optimization problem has to be solved numerically and no closed-form formula is known for the estimators. We denote this class of estimators hereinafter EISE. Lemma A.1 in Matsui and Takemura (2005) provides the linear asymptotic representations
$$\begin{aligned} \psi _1(x,\nu )&= (\nu +1)(\nu +2)^3\frac{x}{((\nu +1)^2+x^2)^2}, \\ \psi _2(x,\nu )&= \frac{1}{2}(\nu +2)-\frac{1}{2}(\nu +2)^3\frac{(\nu +1)^2-x^2}{((\nu +1)^2+x^2)^2}, \end{aligned}$$for \(x\in \mathbb {R}\), which lead for every \(\nu >0\) to an explicit expression of the covariance kernel of \(\mathcal {Z}\) in Theorem 3.1, which can be found in Appendix A.3.
As demonstrated in Remark 3.2, the estimation procedure has some influence on the limit null distribution of \(T_n\). We refer to Chen (2011) and Fegyverneki (2013) for alternative estimators of the parameters of the Cauchy distribution, and to Cohen Freue (2007) for a Pitman estimator of the location parameter \(\alpha \) when \(\beta \) is known.
3.1 Asymptotic normality of the limit statistic under the null hypothesis
The next result gives the limiting distribution of the scaled statistic \(aT_{n,a}\) as \(a\rightarrow 0\), cf. (9).
Theorem 3.3
(a)
Let \({{\,\textrm{arcsine}\,}}(a,b)\) denote the arcsine distribution on [a, b] with distribution function
$$\begin{aligned} F(x)=\frac{2}{\pi }\arcsin \sqrt{\frac{x-a}{b-a}},a\le x\le b. \end{aligned}$$If \(X\sim C(0,1)\), then
$$\begin{aligned} \frac{4X^2}{(1+X^2)^2}&\sim {{\,\textrm{arcsine}\,}}(0,1). \end{aligned}$$ -
(b)
Let \(X_1,\ldots ,X_n\) be i.i.d. \(\text{ C }(0,1)\) distributed random variables, and let \(Y_{n,j}\) be the corresponding scaled residuals, using any estimator \(({\widehat{\alpha }},{\widehat{\beta }})\) with linear representation. Then,
$$\begin{aligned} {\widetilde{T}}_{n,0}&= \sqrt{2n} \left( \frac{8}{n} \sum _{j=1}^n \frac{Y_{n,j}^2}{(1+Y_{n,j}^2)^2} - 1\right) \ {\mathop {\longrightarrow }\limits ^{\mathcal {D}}}\ N(0,1), \end{aligned}$$(18)where N(0, 1) denotes the standard normal distribution.
Proof
(a)
Let U have a uniform distribution on \((-\pi /2,\pi /2)\), and put \(X=\tan U\). Then, \(X\sim C(0,1)\). Further, \(\sin U, \, X^2/(1+X^2)=\sin ^2 U, \, 1/(1+X^2)=\cos ^2 U\) as well as \(2\sin U \cos U\) have an \({{\,\textrm{arcsine}\,}}(-1,1)\) distribution (see, e.g., Norton (1983)). If \(Z \sim {{\,\textrm{arcsine}\,}}(-1,1)\), then \(Z^2 \sim {{\,\textrm{arcsine}\,}}(0,1)\). Hence,
$$\begin{aligned} \frac{4X^2}{(1+X^2)^2}&\sim 4 \sin ^2 U\cos ^2 U \sim {{\,\textrm{arcsine}\,}}(0,1). \end{aligned}$$ -
(b)
A Taylor expansion around \((\alpha _0,\beta _0)=(0,1)\) yields
$$\begin{aligned}&\frac{1}{\sqrt{n}} \sum _{j=1}^n \frac{(X_j-\alpha )^2/\beta ^2}{(1+(X_j-\alpha )^2/\beta ^2)^2} \\&\quad = \frac{1}{\sqrt{n}} \sum _{j=1}^n \left\{ \frac{X_j^2}{(1+X_j^2)^2} + \frac{2X_j(X_j^2-1)\alpha }{(1+X_j^2)^3} + \frac{2X_j^2(X_j^2-1)(\beta -1)}{(1+X_j^2)^3} \right. \\&\quad \left. + f_1(X_j,{\tilde{\alpha }},{\tilde{\beta }}) \alpha ^2 + f_2(X_j,{\tilde{\alpha }},{\tilde{\beta }}) \alpha (\beta -1) + f_3(X_j,{\tilde{\alpha }},{\tilde{\beta }})(\beta -1)^2 \right\} , \end{aligned}$$where \(({\tilde{\alpha }},{\tilde{\beta }})\) lies between \((\alpha ,\beta )\) and (0, 1), and
$$\begin{aligned} f_1(x,\alpha ,\beta )&= \frac{2}{\beta ^2} \left( 1 + \frac{(x-\alpha )^2}{\beta ^2} \right) ^{-2} - \frac{20(x-\alpha )^2}{\beta ^4} \left( 1 + \frac{(x-\alpha )^2}{\beta ^2} \right) ^{-3} \\&\quad + \frac{24(x-\alpha )^4}{\beta ^6} \left( 1 + \frac{(x-\alpha )^2}{\beta ^2} \right) ^{-4}, \\ f_2(x,\alpha ,\beta )&= \frac{4(x-\alpha )}{\beta ^3} \left( 1 + \frac{(x-\alpha )^2}{\beta ^2} \right) ^{-2} - \frac{24(x-\alpha )^3}{\beta ^5} \left( 1 + \frac{(x-\alpha )^2}{\beta ^2} \right) ^{-3} \\&\quad + \frac{24(x-\alpha )^5}{\beta ^7} \left( 1 + \frac{(x-\alpha )^2}{\beta ^2} \right) ^{-4}, \\ f_3(x,\alpha ,\beta )&= \frac{6(x-\alpha )^2}{\beta ^4} \left( 1 + \frac{(x-\alpha )^2}{\beta ^2} \right) ^{-2} - \frac{28(x-\alpha )^4}{\beta ^6} \left( 1 + \frac{(x-\alpha )^2}{\beta ^2} \right) ^{-3} \\&\quad + \frac{24(x-\alpha )^6}{\beta ^8} \left( 1 + \frac{(x-\alpha )^2}{\beta ^2} \right) ^{-4}. \end{aligned}$$If we replace \((\alpha ,\beta )\) by any estimator \(({\widehat{\alpha }}_n,{\widehat{\beta }}_n)\) with linear representation, the second and third term converge to zero in probability, since, by the results given in the proof of part a),
$$\begin{aligned}&\frac{1}{n} \sum _{j=1}^n \frac{X_j(X_j^2-1)}{(1+X_j^2)^3} \overset{a.s.}{\longrightarrow }\mathbb {E}\frac{X_1(X_1^2-1)}{(1+X_1^2)^3} \ = \ 0, \end{aligned}$$(19)$$\begin{aligned}&\frac{1}{n} \sum _{j=1}^n \frac{X_j^2 (X_j^2-1)}{(1+X_j^2)^3} \overset{a.s.}{\longrightarrow }\mathbb {E}\frac{X_1^2(X_1^2-1)}{(1+X_1^2)^3} \ = \ 0, \end{aligned}$$(20)and \(\sqrt{n}{\widehat{\alpha }}_n\) and \(\sqrt{n}({\widehat{\beta }}_n-1)\) have limiting normal distributions. Putting \(z=(x-\alpha )/\beta \), the functions \(f_j, j=1,2,3,\) involve terms of the form \(z^k (1+z^2)^{-l}/\beta ^2\), where \(k<2l\). Hence,
$$\begin{aligned} \frac{1}{n} \sum _{j=1}^n |f_j(X_j,{\tilde{\alpha }}_n,{\tilde{\beta }}_n)|&\le \frac{c_j}{{\tilde{\beta }}_n^2}, \quad j=1,2,3, \end{aligned}$$where \(c_j>0\), and \(({\tilde{\alpha }}_n,{\tilde{\beta }}_n)\) lies between \(({\widehat{\alpha }}_n,{\widehat{\beta }}_n)\) and (0, 1). Since \(\sqrt{n}{\widehat{\alpha }}_n^2\), \(\sqrt{n}{\widehat{\alpha }}_n({\widehat{\beta }}_n-1)\) and \(\sqrt{n}({\widehat{\beta }}_n-1)^2\) converge to zero in probability, and \({\tilde{\beta }}_n {\mathop {\longrightarrow }\limits ^{\mathcal {\mathbb {P}}}}1\), we obtain
$$\begin{aligned} \frac{8}{\sqrt{n}} \sum _{j=1}^n \frac{Y_{n,j}^2}{(1+Y_{n,j}^2)^2} = \frac{8}{\sqrt{n}} \sum _{j=1}^n \frac{X_j^2}{(1+X_j^2)^2} + R_n, \end{aligned}$$where \(R_n{\mathop {\longrightarrow }\limits ^{\mathcal {\mathbb {P}}}}0\) for \(n\rightarrow \infty .\) By a), \(8X_1^2/(1+X_1^2)^2\) has expected value 1 and variance 1/2, and the assertion follows by the central limit theorem. \(\square \)
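Both parts of Theorem 3.3 are easy to check by simulation. The sketch below plugs the true parameters (0, 1) into (18) instead of estimates, which by Remark 3.4 (ii) does not affect the limit law:

```python
import numpy as np

rng = np.random.default_rng(3)

# Part (a): for X ~ C(0,1), 4 X^2 / (1 + X^2)^2 ~ arcsine(0,1),
# whose mean is 1/2.
x = rng.standard_cauchy(500_000)
w_mean = float(np.mean(4.0 * x**2 / (1.0 + x**2) ** 2))

# Part (b): the statistic in (18), with the true parameters (0, 1)
# plugged in instead of estimates.
def t_tilde(y):
    n = y.size
    return np.sqrt(2.0 * n) * (8.0 / n * np.sum(y**2 / (1.0 + y**2) ** 2) - 1.0)

stats = np.array([t_tilde(rng.standard_cauchy(1_000)) for _ in range(2_000)])
```

The empirical mean of \(4X^2/(1+X^2)^2\) is close to 1/2, and the replicated statistics have mean close to 0 and standard deviation close to 1, in line with the N(0, 1) limit.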
Remark 3.4
(i)
Let \(X\sim F\), and assume that \(8X^2/(1+X^2)^2\) has expected value 1 and finite variance. Then, the test based on \({\widetilde{T}}_{n,0}\) is not consistent against F, i.e. the test is not an omnibus test.
(ii)
Typically, the estimation method enters the asymptotic distribution via the functions \(\psi _1\) and \(\psi _2\) in (12) and (13) and the first order terms in the Taylor expansion. Here, this is not the case, because the corresponding expected values in (19) and (20) vanish.
Remark 3.5
Note that the sum appearing on the right-hand side of (10) involves essentially the first component of the score function of the Cauchy distribution (see the second part of Remark 3.2). Hence, when using the maximum likelihood estimator for \((\alpha ,\beta )\), \(\lim _{a\rightarrow \infty } aT_{n,a}\equiv 0\). For other estimators, the limit does not vanish, but a goodness-of-fit test based on it has very low power. Hence, we do not give further results for this statistic.
4 Consistency
In this section we show that the new tests are consistent against all alternatives satisfying a weak moment condition. Let \(X,X_1,X_2,\ldots \) be i.i.d. random variables with cumulative distribution function F having a unique median as well as unique upper and lower quartiles, and suppose \(\mathbb {E}\left[ |X|^4/(1+X^2)^2\right] <\infty \). Since the tests \(T_n\) are affine invariant, we assume w.l.o.g. that the true median is \(\alpha =0\) and the half interquartile range is \(\beta =1\). Under these assumptions we clearly have \((\widehat{\alpha }_n,\widehat{\beta }_n)\overset{\mathbb {P}}{\longrightarrow }(0,1)\) as \(n\rightarrow \infty \) for the MIQ estimation method in Remark 3.2 (for details see Gürtler and Henze (2000)); for all other estimation methods we assume this convergence.
Theorem 4.1
Under the standing assumptions, we have
$$\begin{aligned} \frac{T_{n,a}}{n}\overset{\mathbb {P}}{\longrightarrow }\Delta _F:=\int _{-\infty }^\infty \left| \mathbb {E}\left[ \left( it-\frac{2X}{1+X^2}\right) e^{itX}\right] \right| ^2\omega _a(t)\,\text{ d }t\quad \text{ as } n\rightarrow \infty . \end{aligned}$$
A proof of Theorem 4.1 is deferred to Appendix A.2. In view of the characterization in Theorem 1.1, \(\Delta _F=0\) if and only if \(F=\text{ C }(0,1)\). Hence, we conclude that the tests based on \(T_n\) are able to detect all alternatives satisfying the assumptions of this section.
Remark 4.2
In this remark we fix the weight function \(\omega _1(t)=\exp (-|t|)\), \(t\in \mathbb {R}\). By numerical integration we calculate that if \(X\sim \mathcal {U}(-\sqrt{3},\sqrt{3})\), then \(\Delta _{\mathcal {U}}\approx 2.332019\); if \(X\sim \text{ N }(0,1)\), then \(\Delta _{N}\approx 0.021839\); and if \(X\sim \text{ Log }(0,1)\), then \(\Delta _{L}\approx 0.041495\).
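Distances of this kind can be approximated by quadrature. The sketch below is our reading of \(\Delta _F\) as the population version of the weighted \(L^2\)-distance from Theorem 1.1, with the alternative standardized to median 0 and half-IQR 1; since the exact display of \(\Delta _F\) is not reproduced above, only the qualitative ordering is checked, not the reported values:

```python
import numpy as np

def delta(density, lo_x, hi_x, n_x=4001, t_max=25.0, n_t=501):
    # z(t) = E[(it - 2X/(1+X^2)) e^{itX}] by quadrature over the density,
    # then Delta = int |z(t)|^2 e^{-|t|} dt.
    x = np.linspace(lo_x, hi_x, n_x)
    fx = density(x)
    t = np.linspace(-t_max, t_max, n_t)
    dx, dt = x[1] - x[0], t[1] - t[0]
    phase = np.exp(1j * np.outer(t, x))
    z = ((1j * t)[:, None] - (2.0 * x / (1.0 + x**2))[None, :]) * phase
    zt = (z * fx[None, :]).sum(axis=1) * dx
    return float((np.abs(zt) ** 2 * np.exp(-np.abs(t))).sum() * dt)

# Uniform(-sqrt 3, sqrt 3) standardized to half-IQR 1 is Uniform(-2, 2);
# N(0,1) standardized by its half-IQR 0.6745 has standard deviation 1/0.6745.
d_unif = delta(lambda x: np.full_like(x, 0.25), -2.0, 2.0)
s = 1.0 / 0.6744898
d_norm = delta(lambda x: np.exp(-x**2 / (2 * s * s)) / (s * np.sqrt(2 * np.pi)),
               -12.0, 12.0)
```

As expected, the uniform alternative is far further from the Cauchy model than the normal one, mirroring the ordering of the values above.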
5 Simulation results
In this section, we describe an extensive simulation study to compare the power of the new tests with established ones. For all computations in this and the following section, we used the statistical software R (R Core Team, 2021). All simulations have been done at nominal level \(\alpha =0.05\) and for sample sizes \(n=20\) and \(n=50\). The unknown parameters are estimated either by the median and half-IQR estimators or by the method of maximum likelihood, see Remark 3.2. The convergence of the critical values of the new test statistics to their asymptotic values is generally quite slow. Table 1 shows a few selected cases. For \(T_{n,a}\) in (8) with weight \(a=1\), asymptotic values could be used for sample sizes larger than 100; however, for \(a=5\) and the median and half-IQR estimators, sample sizes of 500 or more are necessary. Hence, in a first step, critical values are determined for each test statistic by a Monte Carlo simulation with \(10^5\) replications. In a second step, the empirical power of the different tests is calculated based on \(10^4\) replications.
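The two-step procedure (Monte Carlo critical values, then empirical power) can be sketched with the limiting statistic (18), since it has an explicit formula; the two-sided rejection rule below is one natural choice given the normal limit, not necessarily the paper's, and the replication counts are reduced for speed:

```python
import numpy as np

rng = np.random.default_rng(2024)

def t_tilde0(x):
    # statistic (18) with median / half-IQR standardization
    q1, med, q3 = np.quantile(x, [0.25, 0.5, 0.75])
    y = (x - med) / (0.5 * (q3 - q1))
    n = y.size
    return np.sqrt(2.0 * n) * (8.0 / n * np.sum(y**2 / (1.0 + y**2) ** 2) - 1.0)

n, level, reps = 50, 0.05, 2_000

# Step 1: finite-sample critical values under H0 by Monte Carlo.
null_stats = np.array([t_tilde0(rng.standard_cauchy(n)) for _ in range(reps)])
lo, hi = np.quantile(null_stats, [level / 2, 1.0 - level / 2])

# Step 2: empirical power against the N(0,1) alternative.
alt_stats = np.array([t_tilde0(rng.normal(size=n)) for _ in range(reps)])
power = float(np.mean((alt_stats < lo) | (alt_stats > hi)))
```

The empirical critical values bracket zero, and the rejection rate against the normal alternative clearly exceeds the nominal level.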
Besides the new tests based on \(T_{n,a}\) with weights \(a=1,2,3,4,5,6\), we use the limiting test \({\widetilde{T}}_{n,0}\) in (18). Power decreased for larger weights for all distributions used in the study; hence, we omit the results for weights larger than 6 as well as for the test based on the limiting statistic \(T_{n,\infty }\) in (10), see also Remark 3.5. Moreover, we use the test based on the KL distance described in Mahdizadeh and Zamanzade (2017). For the nonparametric estimation of the entropy, the authors recommend the window size \(m=4\) for \(n=20\) and \(m=20\) for \(n=50\) (Mahdizadeh and Zamanzade 2017, p. 1109), and we followed this suggestion. For computing the Vasicek estimate of differential Shannon entropy, we used the function entropy.estimate in the R package vsgoftest (Lequesne and Regnault 2020). From the classical tests that utilize the empirical distribution function (so-called edf tests), we choose the following: the Kolmogorov–Smirnov test (KS), the Cramér–von Mises test (CM), the Anderson–Darling test (AD) and the Watson test (W). Finally, we apply the tests based on the statistics \(D_{n,\lambda }\) considered in Gürtler and Henze (2000) and Matsui and Takemura (2005), with weights \(\lambda =1,2,3,4,5,6\).
Of course, there are further goodness-of-fit tests for the Cauchy distribution. For example, one can use the procedure proposed by Chen and Balakrishnan (2012), which consists in transforming the observations to approximately normally distributed random variables and then applying any edf test for normality with estimated parameters. For the latter, there exist tables and software, so there is no need for, say, a parametric bootstrap procedure. This is a nice feature for distributions with shape parameters or in more complicated situations (Klar and Meintanis 2012; Goldmann et al. 2015). However, for location-scale families and parameter-free test statistics, where the null distribution can be obtained by Monte Carlo simulation, this is only a minor advantage. On the other hand, these tests do not maintain the theoretical level exactly. Further, besides the results derived in Goldmann et al. (2015), not much is known about the asymptotic properties of these tests.
Similarly to Chen and Balakrishnan (2012), Villaseñor and González-Estrada (2021) suggest transforming the observations to approximately exponentially distributed random variables. The authors then apply the Anderson–Darling test for exponentiality with estimated parameters. They use this procedure to test for the Cauchy distribution, but recommend obtaining critical values by Monte Carlo simulation. Their results show that none of the classical Anderson–Darling test and the Anderson–Darling tests based on the two transformations outperforms the other two in all cases. Clearly, one could construct a multitude of further tests by (approximate) transformations to an arbitrary continuous distribution and using any of the well-known edf test statistics. For the above reasons, and since we do not aim at a complete overview of goodness-of-fit tests for the Cauchy distribution, we decided not to include such transformation-based tests in our power study.
Besides the standard Cauchy and normal distribution (C(0, 1) and N(0, 1), for short), we use Cauchy-normal mixtures CN\((p)=(1-p)\)C\((0,1)+p\)N(0, 1) with \(p=0.5\) and \(p=0.8\). Further, we employ Student’s t-distribution (Student(k)) with \(k=2,3,5,10\) degrees of freedom, the (symmetric) Tukey g-h distribution with \(g=0\) and \(h=0.2, 0.1, 0.05\), denoted by Tukey(h), and the Tukey lambda distribution (Tukey-L\((\lambda )\)) with \(\lambda =-3,-2,-0.5,0.5\). Note that the Tukey lambda distribution approximates the Cauchy distribution for \(\lambda =-1\), it coincides with the logistic distribution for \(\lambda =0\), it is close to the normal for \(\lambda =0.14\), and coincides with the uniform distribution for \(\lambda =1\). In the family of stable distributions S\((\alpha ,\beta )\), we choose symmetric distributions S\((\alpha ,0)\) with \(\alpha =0.4,0.7,1.2,1.5,1.8\), and skewed distributions S\((\alpha ,1)\) with \(\alpha =0.5,1,1.5,2\). Moreover, we use the uniform, logistic, Laplace, Gumbel and exponential distribution, and the Mittag-Leffler distribution ML\((\alpha )\) with \(\alpha =0.25,0.5,0.75\). The ML\((\alpha )\) distribution has Laplace transform \((1+t^\alpha )^{-1}\), and coincides with the exponential distribution for \(\alpha =1\).
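Several of these alternatives can be sampled via standard transformations; the sketch below (our own, using the usual quantile and transform representations) covers the symmetric Tukey g-h, the Tukey lambda and the Cauchy-normal mixture families:

```python
import numpy as np

rng = np.random.default_rng(11)
m = 100_000
z = rng.standard_normal(m)
u = rng.uniform(size=m)

# Symmetric Tukey g-h (g = 0): X = Z exp(h Z^2 / 2).
h = 0.2
x_gh = z * np.exp(h * z**2 / 2.0)

# Tukey lambda via its quantile function Q(u) = (u^l - (1-u)^l) / l, l != 0.
lam = -0.5
x_tl = (u**lam - (1.0 - u) ** lam) / lam

# Cauchy-normal mixture CN(p) = (1-p) C(0,1) + p N(0,1).
p = 0.5
pick = rng.uniform(size=m) < p
x_cn = np.where(pick, rng.standard_normal(m), rng.standard_cauchy(m))
```

All three families are symmetric about zero for the chosen parameters, which is reflected in their sample medians.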
Power estimates of the tests under discussion are given in Tables 2, 3, 4 and 5. All entries are the percentage of rejection of \(H_0\), rounded to the nearest integer. In Tables 2 and 3, the results using the median and interquartile-distance estimators are given, with sample size \(n=20\) and \(n=50\), respectively. Tables 4 and 5 show the corresponding results for the maximum likelihood estimator.
The main conclusions that can be drawn from the simulation results are the following:
-
As always in similar situations, there is no uniformly most powerful test, in accordance with the results in Janssen (2000).
-
As expected, the power of nearly all test statistics increases with the sample size for all alternative distributions. An exception is the test based on the KL distance: its power decreases for certain alternatives, in particular for the Mittag-Leffler distribution. This behavior has been reported previously, see Tables 3 to 6 in Mahdizadeh and Zamanzade (2019), and may be due to the choice of the window size.
-
The new tests \(T_{n,a}, a=1,\ldots ,6\), perform better with the maximum likelihood estimator than with the median and half-IQR estimators; hence, the latter should not be used for these tests. For all other test statistics, including \({\widetilde{T}}_{n,0}\), the performance is comparable, or even better, when using the median and half-IQR. In any case, the choice of the estimation method can have a pronounced influence on the performance of the tests.
-
The power of \({\widetilde{T}}_{n,0}\) is comparable under both estimation methods.
-
For alternatives with finite first and second moments, the test based on the KL distance has the highest power among all competitors. On the other hand, its power breaks down completely for some alternatives without an existing first moment, and it is low for some alternatives with infinite second moment. Hence, the test cannot really be regarded as an omnibus test.
-
Among the new tests, values of the tuning parameter around \(a=3\) result in a quite homogeneous power against all alternatives. For the tests based on \(D_{n,\lambda }\), \(\lambda =5\) seems to be a good choice. Both classes of tests perform similarly for the ML estimator; for the median and half-IQR estimator, the latter is preferable.
-
Among the group of edf tests, W outperforms the other tests in most cases.
-
For symmetric alternatives without an existing first moment, such as Stable(0.4,0), Stable(0.7,0), Tukey-L(-3) and Tukey-L(-2), the tests based on \({\widetilde{T}}_{n,0}\), AD and \(D_{n,6}\) perform best.
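For reference, the moment-free estimators used in Tables 2 and 3 are straightforward to compute. A minimal sketch (the function name is ours), relying on the fact that the quartiles of C\((\alpha ,\beta )\) are \(\alpha \mp \beta \):

```python
import numpy as np

def cauchy_median_iqr(x):
    # The sample median estimates the location alpha; since the quartiles
    # of C(alpha, beta) are alpha -/+ beta, half the interquartile range
    # estimates the scale beta. Neither estimator requires existing moments.
    x = np.asarray(x, dtype=float)
    alpha_hat = float(np.median(x))
    q25, q75 = np.quantile(x, [0.25, 0.75])
    beta_hat = float(q75 - q25) / 2.0
    return alpha_hat, beta_hat

rng = np.random.default_rng(1)
sample = 3.0 + 2.0 * rng.standard_cauchy(5_000)   # a sample from C(3, 2)
alpha_hat, beta_hat = cauchy_median_iqr(sample)
```

The maximum likelihood estimator, by contrast, has no closed form for the Cauchy family and must be computed numerically.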
6 Real data example: log-returns of cryptocurrencies
In this section, we apply the tests for the Cauchy distribution to log-returns of various cryptocurrencies, namely Bitcoin (BTC), Ethereum (ETH), Ripple (XRP), Litecoin (LTC), BitcoinCash (BCH), EOS (EOS), BinanceCoin (BNB) and Tron (TRX). The Cauchy distribution is found to be a comparably good model for such data sets in Szczygielski et al. (2020). There, 58 hypothetical distributions have been fitted to 15 major cryptocurrencies, and the Cauchy model turned out to be the best fitting distribution for 10 cryptocurrencies (including all currencies considered here). In Szczygielski et al. (2020), the number of observations of the various data sets varied widely from 638 to 2255. Moreover, with very large data sets, any model will eventually be rejected. Hence, we decided to consider shorter time series: a series with daily observations from January 01, 2020 to June 10, 2021, with sample size 527 (526 for ETH), and an even shorter series from January 01, 2021 to June 10, 2021, with sample size 161 (160 for ETH). All prices are closing values in U.S. dollars, freely accessible from CryptoDataDownload via the link www.cryptodatadownload.com/data. Returns are estimated by taking logarithmic differences. Days with zero trading volume are omitted. Figure 1 shows histograms of the datasets with larger time span, together with the densities of fitted Cauchy distributions. Visually, the Cauchy model seems to be a reasonable approximation.
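The return computation itself is a one-liner. A sketch with hypothetical closing prices (the function name and the price values are ours, for illustration only):

```python
import numpy as np

def log_returns(prices):
    # Daily log-returns r_t = log(P_t) - log(P_{t-1}); days with zero
    # trading volume are assumed to have been removed beforehand.
    p = np.asarray(prices, dtype=float)
    return np.diff(np.log(p))

close = np.array([100.0, 102.0, 99.0, 101.5])  # hypothetical closing prices
r = log_returns(close)                          # 3 returns from 4 prices
```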
As in the literature cited above, we assume in the following that the daily log-return data are independent. This is justified in view of the very weak serial correlations in all datasets. Figure 2 shows the autocorrelations of the first four datasets over the larger time span; they are all quite small and not significant in most cases. The findings are similar for the other four cryptocurrencies and for the shorter time span.
Tables 6 and 7 report the results of the different tests for the Cauchy model for the two time series. The p-values are based on Monte Carlo simulation under the Cauchy model with \(10^4\) replications. For the test based on the KL distance, we choose the window length \(m=100\) for the longer time series, and \(m=50\) for the shorter one. The results show that even if the Cauchy distribution fits better than many other distributional models, it is still not an acceptable fit in any of the cases, possibly with the exception of the EOS data. The tests based on \(T_{n,5}, KL\) and \(D_{n,5}\) result in p-values of 0.000 for all currencies for the longer time series, and p-values smaller than 0.01 for all currencies for the shorter one. The edf based tests have generally larger p-values, with the Watson test having the smallest and the Cramér–von Mises test the largest p-values.
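The Monte Carlo scheme exploits the affine invariance of the test statistics: since the statistics are invariant under location-scale transformations, the null distribution can be simulated from the standard Cauchy. The following sketch illustrates such a parametric-bootstrap p-value; it uses a simple KS-type statistic with median/half-IQR standardization as a stand-in (not one of the statistics studied in the paper), and all function names are ours:

```python
import numpy as np

def cauchy_cdf(x):
    # Distribution function of C(0, 1).
    return 0.5 + np.arctan(x) / np.pi

def ks_cauchy(x):
    # KS distance to C(0, 1) after standardizing by median and half-IQR.
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    q25, q75 = np.quantile(x, [0.25, 0.75])
    y = np.sort((x - med) / ((q75 - q25) / 2.0))
    n = len(y)
    u = cauchy_cdf(y)
    i = np.arange(1, n + 1)
    return float(max(np.max(i / n - u), np.max(u - (i - 1) / n)))

def mc_pvalue(x, stat=ks_cauchy, reps=500, seed=0):
    # Affine invariance: the null distribution of the statistic does not
    # depend on (alpha, beta), so we may simulate from C(0, 1) throughout.
    rng = np.random.default_rng(seed)
    t_obs = stat(x)
    t_sim = np.array([stat(rng.standard_cauchy(len(x))) for _ in range(reps)])
    return (1 + int(np.sum(t_sim >= t_obs))) / (reps + 1)

x = np.random.default_rng(42).standard_cauchy(161)  # same n as the short series
p_val = mc_pvalue(x)
```

In the paper, \(10^4\) replications are used instead of the 500 chosen here for speed; the `(1 + #{t_sim >= t_obs}) / (reps + 1)` form avoids p-values of exactly zero by construction, so reported values of 0.000 are a matter of rounding.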
Overall, the EOS data seem to be most compatible with the hypothesis of a Cauchy distribution. For the shorter series, the p-values of all edf based tests are larger than 0.1, and the tests based on \(T_{n,1}, {\tilde{T}}_{n,0}, KS\) and CM do not reject \(H_0\) at the 5% level even for the larger data set. These tests also yield relatively large p-values for Ripple (XRP) and BitcoinCash (BCH). On the whole, the results confirm the findings of the simulation study concerning the comparative power of the tests.
7 Conclusion
We have proposed a family of tests for the Cauchy family of distributions with desirable theoretical and computational features, such as consistency and competitiveness with existing procedures. In view of the results of the simulation study, we suggest using the tuning parameter \(a=4\) in applications, which gives good power against most alternatives.
We conclude the paper by pointing out some open problems for further research. The explicit formulas for the covariance kernels in Remark 3.2 open the way to a numerical approximation of the eigenvalues of the integral operator \(\mathcal {K}\). An approximation of the largest eigenvalue, in particular, would offer further theoretical insights, since it is well known to be useful for efficiency statements in the sense of Bahadur, see Bahadur (1960) and Nikitin (1995). A profound knowledge of the eigenvalues leads to explicit values of the cumulants of the null distribution, so that a fit of the null distribution to the Pearson distribution system becomes possible. This usually gives a very accurate approximation of the quantiles of the distribution and hence of the critical values of the test; see, e.g., Ebner (2020) and Henze (1990) in the context of normality testing. In view of the results of Baringhaus et al. (2017), we conjecture that for a large class of fixed alternatives \(\sqrt{n}(T_n/n-\Delta _F)\) converges weakly to a centred normal distribution, where the variance is positive and may be consistently estimated. Since the derivation involves heavy algebra, we postpone results of this kind to future work. Another desirable extension for the family of tests is a data-dependent choice of the tuning parameter a; see Tenreiro (2019) for a procedure which might be applicable. Finally, the choice of the weight function \(\omega _a(t)\) clearly has an impact on the power of the tests, so other choices may lead to better performance.
References
Anastasiou A, Barp A, Briol F-X, Ebner B, Gaunt RE, Ghaderinezhad F, Gorham J, Gretton A, Ley C, Liu Q, Mackey L, Oates CJ, Reinert G, Swan Y (2022) Stein’s method meets computational statistics: a review of some recent developments. Stat Sci. https://doi.org/10.1214/22-STS863
Bahadur RR (1960) Stochastic comparison of tests. Ann Math Stat 31:276–295
Baringhaus L, Ebner B, Henze N (2017) The limit distribution of weighted \({L}^2\)-Goodness-of-Fit statistics under fixed alternatives, with applications. Ann Inst Stat Math 69(5):969–995
Besbeas P, Morgan BJ (2001) Integrated squared error estimation of Cauchy parameters. Stat Probab Lett 55:397–401
Chen G, Balakrishnan N (2012) A general purpose approximate goodness-of-fit test. J Qual Technol 27:154–161
Chen Z (2011) A simple method for estimating parameters of the location-scale distribution family. J Stat Comput Simul 81(1):49–58
Cohen Freue GV (2007) The Pitman estimator of the Cauchy location parameter. J Stat Plan Inference 137(6):1900–1913
Ebner B (2020) On combining the zero bias transform and the empirical characteristic function to test normality. ALEA 18:1029–1045
Fegyverneki S (2013) A simple robust estimation for parameters of Cauchy distribution. Miskolc Math. Notes 14(3):887–892
Goldmann G, Klar B, Meintanis S (2015) Data transformations and goodness-of-fit tests for type-II right censored samples. Metrika 78:59–83
Gürtler N, Henze N (2000) Goodness-of-fit tests for the Cauchy distribution based on the empirical characteristic function. Ann Inst Stat Math 52(2):267–286
Henze N (1990) An approximation to the limit distribution of the Epps-Pulley test statistic for normality. Metrika 37(1):7–18
Janssen A (2000) Global power functions of goodness of fit tests. Ann Stat 28(1):239–253
Johnson NL, Kotz S, Balakrishnan N (1994) Continuous univariate distributions, vol 1, 2nd edn. Wiley, New York
Klar B, Meintanis S (2012) Specification tests for the response distribution in generalized linear models. Comput Stat 27:251–267
Lequesne J, Regnault P (2020) vsgoftest: An R package for goodness-of-fit testing based on Kullback–Leibler divergence. J Stat Softw 96(1):1–26
Litvinova VV (2005) Two criteria of goodness-of-fit for Cauchy distributions based on characterizations. J Math Sci 127(1):1752–1756
Mahdizadeh M, Zamanzade E (2017) New goodness of fit tests for the Cauchy distribution. J Appl Stat 44(6):1106–1121
Mahdizadeh M, Zamanzade E (2019) Goodness-of-fit testing for the Cauchy distribution with application to financial modeling. J King Saud Univ 31(4):1167–1174
Maplesoft (2019) Maple. Maplesoft, Waterloo
Matsui M, Takemura A (2005) Empirical characteristic function approach to goodness-of-fit tests for the Cauchy distribution with parameters estimated by mle or eise. Ann Inst Stat Math 57(1):183–199
Nikitin Y (1995) Asymptotic efficiency of nonparametric tests. Cambridge University Press, Cambridge
Nolan JP (2020) Univariate stable distributions: models for heavy tailed data. Springer series in operations research and financial engineering. Springer, Cham
Norton RM (1983) A characterization of the Cauchy distribution. Sankhyā 45:247–252
Onen BH, Dietz DC, Yen VC, Moore AH (2001) Goodness-of-fit tests for the Cauchy distribution. Comput Stat 16(1):97–107
R Core Team (2021) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna
Rublik F (2001) A quantile goodness-of-fit test for Cauchy distribution, based on extreme order statistics. Appl Math 46(5):339–351
Serfling RJ (1980) Approximation theorems of mathematical statistics. Wiley, New York
Stein C, Diaconis P, Holmes S, Reinert G (2004) Use of exchangeable pairs in the analysis of simulations. In: Diaconis P, Holmes S (eds) Stein’s method, volume 46 of Lecture Notes—Monograph Series. Institute of Mathematical Statistics, Beachwood, pp 1–25
Stigler SM (1974) Studies in the history of probability and statistics. XXXIII: Cauchy and the witch of Agnesi: An historical note on the Cauchy distribution. Biometrika 61:375–380
Szczygielski J, Karathanasopoulos A, Zaremba A (2020) One shape fits all? A comprehensive examination of cryptocurrency return distributions. Appl. Econ. Lett. 27:1567–1573
Tenreiro C (2019) On the automatic selection of the tuning parameter appearing in certain families of goodness-of-fit tests. J Stat Comput Simul 89(10):1780–1797
Villaseñor J, González-Estrada E (2021) Goodness-of-fit tests for Cauchy distributions using data transformations. In: Ghosh I, Balakrishnan N, Ng H (eds) Advances in statistics: theory and applications. Emerging topics in statistics and biostatistics. Springer, Berlin
Zhang J (2002) Powerful goodness-of-fit tests based on the likelihood ratio. J R Stat Soc B 64(2):281–294
Acknowledgements
The authors are grateful for the suggestions and comments of three anonymous reviewers which helped to improve the paper.
Funding
Open Access funding enabled and organized by Projekt DEAL.
Proofs
1.1 Proof of Theorem 3.1
Recall that \(T_n=\Vert Z_n\Vert ^2_\mathbb {H}\) with \(Z_n\) defined in (11). In view of the affine invariance of \(T_n\) we assume w.l.o.g. \(\alpha =0\) and \(\beta =1\). Denoting
we have \(Z_n(t)=\frac{1}{\sqrt{n}}\sum _{j=1}^nh(t,Y_{n,j})\). A simple change of variable and a multivariate Taylor expansion around \((\alpha ,\beta )=(0,1)\) shows
where
and
The remainder term is \(R_j(u)=\left( \widehat{\alpha }_n,\widehat{\beta }_n-1\right) A_j(u,\xi _n) \left( \widehat{\alpha }_n,\widehat{\beta }_n-1\right) ^\top \), \(u\in \mathbb {R}\), where \(A_j(u,\xi _n)\) is the \(2\times 2\) matrix of second order partial derivatives in the Taylor expansion. Here, each component of the matrix is evaluated at \(\xi _n=(\xi _{1,n},\xi _{2,n})\in [0,\widehat{\alpha }_n]\times [1,\widehat{\beta }_n]\), where the order of the endpoints of the intervals is reversed whenever \(\widehat{\alpha }_n<0\) or \(\widehat{\beta }_n<1\). Note that by assumption, we have
with \(r_{1n},r_{2n}=o_\mathbb {P}(1)\), \(\mathbb {E}[\psi _1(X)]=\mathbb {E}[\psi _2(X)]=0\) and \(\mathbb {E}[\psi (X)\psi (X)^\top ]=C\cdot I_{2}\), where \(X\sim \text{ C }(0,1)\), \(\psi (X)=(\psi _1(X),\psi _2(X))^\top \), C is a positive constant and \(I_{2}\) is the \(2\times 2\) identity matrix. This shows that \((\widehat{\alpha }_n,\widehat{\beta }_n)\overset{\mathbb {P}}{\rightarrow }(0,1)\) and consequently \(\xi _n\overset{\mathbb {P}}{\rightarrow }(0,1)\) as \(n\rightarrow \infty \). Next, we denote by \(\Vert \cdot \Vert _{\mathbb {H}_n}\) the norm in the Hilbert space \(\mathbb {H}_n=L^2(\mathbb {R},\mathcal {B}, \omega (\widehat{\beta }_n u)\text{ d }u)\), equipped with the usual scalar product. By the Cauchy-Schwarz inequality, we have
where \(\Vert \cdot \Vert _F\) denotes the Frobenius norm. Denote by \(A_{j,1}(u,\xi _n)\) the upper left component of the matrix \(A_j(u,\xi _n)\). Writing \(\kappa _{m,\ell }(x)=x^{m}(1+x^2)^{-\ell }\) for all \(x\in \mathbb {R}\) and \(m,\ell \in \mathbb {N}\), we have \(|\kappa _{m,\ell }(x)|\le 1\) for \(m\le \ell \in \mathbb {N}\), and hence
follows. Similar expressions hold for all components of the matrix \(A_j(u,\xi _n)\). Therefore by a change of variable we have
where C is a positive constant.
Hence, defining
we have by the triangle inequality and (25)
The last term is \(o_{\mathbb {P}}(1)\), since \(\sqrt{n}\left( \widehat{\alpha }_n,\widehat{\beta }_n-1\right) \) is tight due to the linearity assumption (24) and the multivariate central limit theorem, \(\xi _{2,n}\overset{\mathbb {P}}{\longrightarrow }1\), the continuous mapping theorem, as well as (7) and \(\left( \widehat{\alpha }_n,\widehat{\beta }_n-1\right) =o_{\mathbb {P}}(1)\). We conclude that \(\Vert Z_n\Vert _{\mathbb {H}}^2=\widehat{\beta }_n\Vert Z_n^*\Vert _{\mathbb {H}_n}^2+o_{\mathbb {P}}(1)\).
We obtain using symmetry arguments
By applying the Cauchy-Schwarz inequality, we directly see that both functions belong to \(\mathbb {H}\), since
By the law of large numbers in Hilbert spaces, we have for \(l=1,2\)
Replacing \(\sqrt{n}\widehat{\alpha }_n\) and \(\sqrt{n}({\widehat{\beta }}_n-1)\) by the linear representations (24), and substituting the arithmetic means by their asymptotic counterparts leads to
Since for all \(a_1,a_2\in \mathbb {R}\), \((a_1+a_2)^2\le 2a_1^2+2a_2^2\) holds, we have
By (27), the tightness of \(\frac{1}{\sqrt{n}}\sum _{j=1}^n\psi _l(X_j),l=1,2\) and Slutzky’s Theorem, we have \(\Vert Z_n^*-{\widetilde{Z}}_n\Vert _{\mathbb {H}}\overset{\mathbb {P}}{\longrightarrow }0\), as \(n\rightarrow \infty \). Hence \(Z_n^*\) and \({\widetilde{Z}}_n\) share the same weak limit in \(\mathbb {H}\). Writing \({\widetilde{Z}}_n(t)=n^{-1/2}\sum _{j=1}^n{\widetilde{Z}}_{n,j}(t)\), \(t\in \mathbb {R}\), where \({\widetilde{Z}}_{n,j}(t)= h(t,X_j) -\frac{1}{2}\big (t^2+\vert t\vert +1\big )e^{-\vert t\vert }\psi _1(X_j)+\frac{1}{2}t\big (\vert t\vert +1\big )e^{-\vert t\vert }\psi _2(X_j)\), \(t\in \mathbb {R}\), \(j=1,\ldots ,n,\) we have \(\mathbb {E}[{\widetilde{Z}}_{n,1}(t)]=0\) and \({\widetilde{Z}}_{n,1},{\widetilde{Z}}_{n,2},\ldots \) are i.i.d. centred random elements in \(\mathbb {H}\). Straightforward calculations show that \(K(s,t)=\mathbb {E}[{\widetilde{Z}}_{n,1}(s){\widetilde{Z}}_{n,1}(t)]\) has the stated formula and the weak limit of \({\widetilde{Z}}_n\) follows by the central limit theorem in Hilbert spaces. To conclude the proof, we have by the Cauchy-Schwarz inequality
Since by assumption \(\omega ^{1/2}\) is a weight function, we directly see using the Cauchy-Schwarz inequality and the continuous mapping theorem that \(\left( \int _{-\infty }^\infty \left( Z_n^*(t)\right) ^4 \omega (t)\,dt\right) ^{1/2}\) is tight. Since by direct calculations \(\int _{-\infty }^\infty \left| \frac{\omega (\widehat{\beta }_n t)}{\omega (t)}-1\right| ^2\omega (t)\,dt=o_{\mathbb {P}}(1)\) holds, we see that \(\Vert Z_n\Vert _{\mathbb {H}}^2=\widehat{\beta }_n\Vert Z_n^*\Vert _{\mathbb {H}}^2+o_{\mathbb {P}}(1)\). Finally, by Slutzky’s theorem, \(\Vert Z_n\Vert _{\mathbb {H}}^2\) has the same weak limit as \(\Vert {\widetilde{Z}}_n\Vert _{\mathbb {H}}^2\) and the assertion follows.\(\square \)
1.2 Proof of Theorem 4.1
Let \(Z_n\) and \(Z_n^*\) be defined as in (11) and (26). Since by assumption we have \((\widehat{\alpha }_n,\widehat{\beta }_n)\overset{\mathbb {P}}{\longrightarrow }(0,1)\), we can show in complete analogy to the proof of Theorem 3.1, that \(n^{-1}\left| \Vert Z_n\Vert _{\mathbb {H}}^2-\Vert Z_n^*\Vert _{\mathbb {H}}^2\right| \overset{\mathbb {P}}{\longrightarrow }0\) holds. Let \(g_j(\cdot ,\cdot )\), \(j=1,2\), be defined as in (22) and (23), and put \(Z_n^0(t)=n^{-1/2}\sum _{j=1}^nh(t,X_j)\), \(t\in \mathbb {R}.\) Then \(n^{-1/2}\big (Z_n^*(t)-Z_n^0(t)\big ) =\widehat{\alpha }_n/n\sum _{j=1}^ng_1(t,X_j)+(\widehat{\beta }_n-1)/n\sum _{j=1}^ng_2(t,X_j) \) follows and by the triangle inequality we have
By the law of large numbers in Hilbert spaces and \((\widehat{\alpha }_n,\widehat{\beta }_n)\overset{\mathbb {P}}{\longrightarrow }(0,1)\), the right hand side is \(o_\mathbb {P}(1)\). Note that the existence of the mean values is guaranteed by the assumptions. Again, the law of large numbers in \(\mathbb {H}\) shows \(n^{-1/2}Z_n^0(t)\overset{a.s.}{\longrightarrow }\mathbb {E}h(t,X)\), as \(n\rightarrow \infty \), and by the symmetry of the weight function, we have
as \(n\rightarrow \infty \). \(\square \)
1.3 Covariance kernel under EISE estimators
We have for the covariance kernel in Theorem 3.1 in case of EISE estimators
where \({{\,\textrm{sgn}\,}}(\cdot )\) is the sign function, and
Ebner, B., Eid, L. & Klar, B. Cauchy or not Cauchy? New goodness-of-fit tests for the Cauchy distribution. Stat Papers 65, 45–78 (2024). https://doi.org/10.1007/s00362-022-01382-0