Abstract
We study the Bahadur efficiency of several weighted L2-type goodness-of-fit tests based on the empirical characteristic function. The methods considered are for normality and exponentiality testing, and for testing goodness-of-fit to the logistic distribution. Our results are helpful in deciding which specific test a potential practitioner should apply. For the celebrated BHEP and energy tests for normality we obtain novel efficiency results, with some of them in the multivariate case, while in the case of the logistic distribution this is the first time that efficiencies are computed for any composite goodness-of-fit test.
Notes
The original weight function also includes the constant \(2\sqrt{\pi } \varGamma (1-(\gamma /2))/(\gamma 2^\gamma \varGamma ((1+\gamma )/2))\).
As a numerical confirmation, we have compared the efficiencies obtained by using Baringhaus’ eigenvalues with those using the Monte Carlo method and found them to be quite close in all cases.
For the hybrid case we only present results for contamination alternatives, as location alternatives are clearly excluded, and efficiencies for the correlation and scale alternatives coincide with those of the simple hypothesis case.
References
Allison J, Santana L (2015) On a data-dependent choice of the tuning parameter appearing in certain goodness-of-fit tests. J Stat Comput Simul 85(16):3276–3288
Bahadur R (1967) An optimal property of the likelihood ratio statistic. In: Proceedings of the fifth Berkeley symposium on mathematical statistics and probability, vol 1, pp 13–26
Bahadur RR (1971) Some limit theorems in statistics. SIAM
Baringhaus L (1996) Fibonacci numbers, Lucas numbers and integrals of certain Gaussian processes. Proc Am Math Soc 124(12):3875–3884
Božin V, Milošević B, Nikitin YY, Obradović M (2020) New characterization-based symmetry tests. Bull Malays Math Sci Soc 43(1):297–320
Cuparić M, Milošević B, Obradović M (2019) New \({L}^2\)-type exponentiality tests. SORT Stat Oper Res Trans 43(1):25–50
Cuparić M, Milošević B, Obradović M (2022) New consistent exponentiality tests based on \( V \)-empirical Laplace transforms with comparison of efficiencies. Revista de la Real Academia de Ciencias Exactas, Físicas y Naturales Serie A Matemáticas 116(42):1–26
Drost F, Kallenberg W, Oosterhoff J (1990) The power of EDF tests of fit under non-robust estimation of nuisance parameters. Stat Decis 8:167–182
Ebner B, Henze N (2021) Bahadur efficiencies of the Epps–Pulley test for normality. Zapiski Nauchnykh Seminarov POMI 501:302–314
Ebner B, Henze N (2022) On the eigenvalues associated with the limit null distribution of the Epps–Pulley test for normality. Stat Papers. https://doi.org/10.1007/s00362-022-01336-6
Epps T, Pulley L (1983) A test for normality based on the empirical characteristic function. Biometrika 70(3):723–726
Gradshteyn I, Ryzhik I (1994) Tables of integrals, series, and products. Academic Press, New York
Grané A, Fortiana J (2011) A directional test of exponentiality based on maximum correlations. Metrika 73(2):255–274
Grané A, Tchirina A (2013) Asymptotic properties of goodness-of-fit test based on maximum correlations. Statistics 47(1):202–205
Gregory GG (1980) On efficiency and optimality of quadratic tests. Ann Stat 8(1):116–131
Gulati S, Shapiro S (2009) A new goodness of fit test for the logistic distribution. J Stat Theory Pract 3(3):567–576
Gürtler N, Henze N (2000) Goodness-of-fit tests for the Cauchy distribution based on the empirical characteristic function. Ann Inst Stat Math 52(2):267–286
Henze N, Wagner T (1997) A new approach to the BHEP tests for multivariate normality. J Multivar Anal 62(1):1–23
Henze N, Meintanis SG (2005) Recent and classical tests for exponentiality: a partial review with comparisons. Metrika 61(1):29–45
Jones MC (2015) On families of distributions with shape parameters. Int Stat Rev 83(2):175–192
Kato T (2013) Perturbation theory for linear operators, vol 132. Springer
Ley C (2015) Flexible modelling in statistics: past, present and future. Journal de la Société Française de Statistique 156(1):76–96
Meintanis SG (2004) Goodness-of-fit tests for the logistic distribution based on empirical transforms. Sankhyā Indian J Stat 66(2):306–326
Meintanis SG, Swanepoel J (2007) Bootstrap goodness-of-fit tests with estimated parameters based on empirical transforms. Stat Probab Lett 77(10):1004–1013
Milošević B (2016) Asymptotic efficiency of new exponentiality tests based on a characterization. Metrika 79(2):221–236
Milošević B, Obradović M (2016) Some characterization based exponentiality tests and their Bahadur efficiencies. Publications de L’Institut Mathematique 100(114):107–117
Milošević B, Nikitin YY, Obradović M (2021) Bahadur efficiency of EDF based normality tests when parameters are estimated. Zapiski Nauchnykh Seminarov POMI 501:203–217
Móri TF, Székely GJ, Rizzo ML (2021) On energy tests of normality. J Stat Plan Inference 213:1–15
Nikitin YY (1995) Asymptotic efficiency of nonparametric tests. Cambridge University Press, New York
Nikitin YY, Peaucelle I (2004) Efficiency and local optimality of nonparametric tests based on U- and V-statistics. Metron Int J Stat LXII(2):185–200
Nikitin YY, Tchirina AV (1996) Bahadur efficiency and local optimality of a test for the exponential distribution based on the Gini statistic. J Ital Stat Soc 5(1):163–175
Nikitin YY, Volkova KY (2016) Efficiency of exponentiality tests based on a special property of exponential distribution. Math Methods Stat 25(1):54–66
Rublík F (1989) On optimality of the LR tests in the sense of exact slopes I. General case. Kybernetika 25(1):13–14
Stephens MA (1979) Tests of fit for the logistic distribution based on the empirical distribution function. Biometrika 66(3):591–595
Subba Rao S (2017) Lecture notes: advanced statistical inference. http://web.stat.tamu.edu/~suhasini/teaching613/teaching613_2017.html
Székely GJ, Rizzo ML (2005) A new test for multivariate normality. J Multivar Anal 93(1):58–80
Székely GJ, Rizzo ML (2013) Energy statistics: a class of statistics based on distances. J Stat Plan Inference 143(8):1249–1272
Tenreiro C (2009) On the choice of the smoothing parameter for the BHEP goodness-of-fit test. Comput Stat Data Anal 53(4):1038–1053
Tenreiro C (2019) On the automatic selection of the tuning parameter appearing in certain families of goodness-of-fit tests. J Stat Comput Simul 89(10):1780–1797
Villa C (2016) A property of the Kullback–Leibler divergence for location-scale models. arXiv preprint arXiv:1604.01983
Funding
Funding is provided by the Ministry of Science, Technological Development and Innovation of the Republic of Serbia.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The work of B. Milošević is supported by the Ministry of Education, Science and Technological Development of Republic of Serbia.
Appendix
1.1 Regularity conditions A
There exists \(\delta >0\) such that for \(|\theta |\le \delta \) it holds:
- A1. The MLEs \({\hat{\mu }}\) and \({\hat{\sigma }}\) of \(\mu \) and \(\sigma \) exist (under the null model) and satisfy \({\hat{\mu }}\overset{P_{\theta }}{\rightarrow }\mu (\theta )\) and \({\hat{\sigma }}\overset{P_{\theta }}{\rightarrow }\sigma (\theta )\);
- A2. The functions \(\mu (\theta )\) and \(\sigma (\theta )\) are three times continuously differentiable;
- A3. The functions \(L_1(x;\theta )=\log g_{\theta }(x)\) and \(L_2(x;\theta )=\log \Big (\frac{1}{\sigma (\theta )}f_0\Big (\frac{x-\mu (\theta )}{\sigma (\theta )}\Big )\Big )\) are three times differentiable with respect to \(\theta \);
- A4. \(\Big |\frac{\partial ^{i}L_k(x;\theta )}{\partial \theta ^i}\frac{\partial ^{j}g_{\theta }(x)}{\partial \theta ^j}\Big |<M_{i,j}(x)\) for \(i,j=0,1,2\), \(i+j\le 3\), \(k=1,2\), where \(M_{i,j}(x)\) are integrable functions.
Proof of Theorem 3.1
Let \(\mu =\mu (\theta )\) and \(\sigma =\sigma (\theta )\) be the values that minimize (3.1); their existence follows from Condition A1 by considering the sample Kullback–Leibler distance \(\frac{1}{n}\sum _{i=1}^n\log \Big (\frac{g_{\theta }(x_i)}{\frac{1}{\sigma } f_0(\frac{x_i-\mu }{\sigma })}\Big )\), as in Villa (2016).
Conditions A2–A4 enable differentiation under the integral sign. Hence, differentiating (3.1) with respect to \(\theta \) we get
It is easy to show that \(K'(0)=0\). Differentiating (3.1) once more we obtain at \(\theta =0\),
where \(h(x)=\frac{\partial }{\partial \theta }g_{\theta }(x)|_{\theta =0}\) and \(u(x)=\frac{\partial ^2}{\partial \theta ^2}g_{\theta }(x)|_{\theta =0}\). Applying a change of variable, rearranging terms in the above expression, and expanding \(K(\theta )\) into a Maclaurin series completes the proof. \(\square \)
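The final expansion step can be sketched as follows. This assumes, in addition to \(K'(0)=0\) shown above, that \(K(0)=0\), which is the case when \(g_0\) belongs to the null family so that the Kullback–Leibler distance vanishes at \(\theta =0\). The explicit form of \(K''(0)\) in terms of \(h\) and \(u\) is not reproduced here.

$$\begin{aligned} K(\theta )=K(0)+K'(0)\theta +\frac{K''(0)}{2}\theta ^2+o(\theta ^2)=\frac{K''(0)}{2}\theta ^2+o(\theta ^2),\quad \theta \rightarrow 0. \end{aligned}$$

Under these assumptions the local behaviour of \(K(\theta )\) is therefore governed entirely by the second derivative at zero.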
1.2 Regularity conditions B
Let the test statistic be a V-statistic of the form
for some degenerate kernel \(\varPhi \). Then \(b(\theta )\) has representation
where
To allow a Taylor expansion of \(b(\theta )\), suppose that the following regularity conditions, similar to A1–A4, are satisfied.
- B1. \({\hat{\mu }}\) and \({\hat{\sigma }}\) are consistent estimators of \(\mu (\theta )\) and \(\sigma (\theta )\), i.e. \({\hat{\mu }}\overset{P_{\theta }}{\rightarrow }\mu (\theta )\) and \({\hat{\sigma }}\overset{P_{\theta }}{\rightarrow }\sigma (\theta )\);
- B2. The functions \(\mu (\theta )\) and \(\sigma (\theta )\) are three times continuously differentiable;
- B3. The function \({\widetilde{g}}(x;\theta )\) is three times differentiable with respect to \(\theta \);
- B4. \(\Big |\frac{\partial ^{i}{\widetilde{g}}(x;\theta )}{\partial \theta ^i}\frac{\partial ^{j}{\widetilde{g}}(y;\theta )}{\partial \theta ^j}\Big |<{\widetilde{M}}_{i,j}(x,y)\) for \(i,j=0,1,2\), \(i+j\le 3\), where \({\widetilde{M}}_{i,j}(x,y)\) are integrable functions.
Following Nikitin and Peaucelle (2004) and the arguments therein, the regularity conditions can be expressed in terms of \({\widetilde{g}}(z;\theta )\), i.e. Assumptions WD:
- WD1.
$$\begin{aligned} \int _{0}^1t^3(1-t)^3\int _{-\infty }^{\infty }\varPhi (z_1,z_2){\widetilde{g}}'''(z_1;t\theta ){\widetilde{g}}(z_2;t\theta )\textrm{d}z_1\textrm{d}z_2\textrm{d}t<\infty ; \end{aligned}$$
- WD2.
$$\begin{aligned} \int _{0}^1t^3(1-t)^3\int _{-\infty }^{\infty }\varPhi (z_1,z_2){\widetilde{g}}''(z_1;t\theta ){\widetilde{g}}'(z_2;t\theta )\textrm{d}z_1\textrm{d}z_2\textrm{d}t<\infty . \end{aligned}$$
Remark
If the kernel function \(\varPhi (\cdot )\) is uniformly bounded, then conditions B3 and B4 can be simplified by replacing \({\widetilde{g}}\) with g, while WD1 and WD2 follow from B3 and B4.
By way of example, we consider the regularity conditions for the second Ley-Paindaveine alternative to the logistic null distribution. (If we consider the same alternative in the case of the normal distribution, the regularity conditions may be shown by analogous arguments.) Specifically, for regularity conditions A, note that the density is:
(For simplicity and without loss of generality, here we can set the null parameters to \(\mu =0\) and \(\sigma =1\).)
Condition A1. Given the MLEs for \(\mu \) and \(\sigma \) in the logistic model, the consistency of the estimators under the considered alternative can be justified using arguments from Theorem 2.6.1 of Subba Rao (2017). In particular, the functions \(\log g_\theta (x)\) and \(\frac{\partial ^2}{\partial \theta ^2}\log g_{\theta }(x)\) are Lipschitz-continuous for \(|\theta |\le \delta \), the parameter space \(\{\theta : |\theta |\le \delta \}\) is compact, and pointwise convergence of the log-likelihood and its second derivative to their expectations holds by the law of large numbers.
Condition A2 follows from the implicit function theorem.
Condition A3 is obviously satisfied.
Condition A4. Let \(|\theta |\le \delta \). Then functions \(\mu (\theta )\) and \(\sigma (\theta )\) are bounded due to their continuity. The integrable dominants are available from the following bounds, where all constants C and \(C_j\) depend on \(\delta \) (and are different in each equation).
From the inequality \(|\log \frac{e^{-x}}{(1+e^{-x})^2}|\le |x|+2\ln 2\) we get
For the derivatives of \(L_1\) we have
The expressions for the derivatives of \(L_2\) are too cumbersome to display. However, using the boundedness of the functions \(\frac{e^{-ax}}{(1+e^{-x})^b}\) for \(0\le a\le b\) and the boundedness of the derivatives of \(\mu (\theta )\) and \(\sigma (\theta )\), we get
The density \(g_\theta (x)\) itself and its first derivative are simply bounded by the standard logistic density as
and its higher derivatives are equal to zero.
The integrability of all dominants now follows from the finiteness of moments of the logistic distribution.
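The two elementary facts behind these dominants can be checked numerically. The following is a minimal sketch, not part of the original argument: it verifies on a grid that the standard logistic log-density grows at most linearly, using a bound of the form \(|x|+C\) with \(C=2\ln 2\) (the value attained at \(x=0\)), and that \(\frac{e^{-ax}}{(1+e^{-x})^b}\) stays in \((0,1]\) when \(0\le a\le b\). The function names and the grid are illustrative assumptions.

```python
import numpy as np

def log_logistic_density(x):
    """log of e^{-x} / (1 + e^{-x})^2, evaluated stably via symmetry in x."""
    ax = np.abs(x)
    return -ax - 2.0 * np.log1p(np.exp(-ax))

def ratio(x, a, b):
    """e^{-a x} / (1 + e^{-x})^b, computed in log-space to avoid overflow."""
    # np.logaddexp(0, -x) equals log(1 + e^{-x})
    return np.exp(-a * x - b * np.logaddexp(0.0, -x))

x = np.linspace(-50.0, 50.0, 20001)

# Linear-growth bound on the log-density: |log f(x)| <= |x| + 2 ln 2.
slack = np.abs(x) + 2.0 * np.log(2.0) - np.abs(log_logistic_density(x))
bound_holds = bool(slack.min() >= -1e-12)

# Ratios with 0 <= a <= b do not exceed 1 anywhere on the grid.
pairs = [(0.0, 1.0), (1.0, 1.0), (1.0, 2.0), (2.0, 3.0)]
max_ratio = max(float(ratio(x, a, b).max()) for a, b in pairs)

print(bound_holds, max_ratio)
```

A grid check is of course not a proof; it only illustrates why polynomial-in-\(|x|\) dominants multiplied by the logistic density are integrable. Taking \(a>b\) (for instance `ratio(x, 3.0, 2.0)`) shows that the ratio is no longer bounded as \(x\rightarrow -\infty \).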
Turning to regularity conditions B, Condition B1 coincides with Condition A1 in the case of MLEs, while in the case of moment estimators, which are available in analytic form, it follows from the law of large numbers. Condition B2 again coincides with Condition A2 in the case of MLEs, while in the case of moment estimators it follows from the differentiability of the function \(g(\cdot )\) and the finiteness of \(\int _{-\infty }^{\infty }x^2|\frac{\partial ^k g_{\theta }(x)}{\partial \theta ^k}| dx\), \(k=0,1,2,3\). Since the kernel \(\varPhi \) of the test for the logistic distribution is bounded, by the Remark above, conditions B3 and B4 follow from A3 and A4.
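To illustrate the law-of-large-numbers argument for moment estimators, here is a minimal sketch. It assumes the usual location-scale parametrization of the logistic distribution, for which \(E[X]=\mu \) and \(\mathrm{Var}[X]=\sigma ^2\pi ^2/3\), so that the moment estimators are the sample mean and \(\sqrt{3}/\pi \) times the sample standard deviation. The helper name `logistic_moment_estimates` is hypothetical, and for simplicity the data are simulated under the null model rather than under an alternative \(g_\theta \).

```python
import numpy as np

def logistic_moment_estimates(sample):
    """Moment estimators for the logistic location-scale family.

    Assumes E[X] = mu and Var[X] = sigma^2 * pi^2 / 3.
    """
    sample = np.asarray(sample, dtype=float)
    mu_hat = sample.mean()
    sigma_hat = np.sqrt(3.0) * sample.std(ddof=0) / np.pi
    return mu_hat, sigma_hat

rng = np.random.default_rng(0)
mu, sigma = 1.0, 2.0  # illustrative true values

# The estimates should settle near (mu, sigma) as the sample size grows.
for n in (100, 10_000, 1_000_000):
    x = rng.logistic(loc=mu, scale=sigma, size=n)
    print(n, logistic_moment_estimates(x))
```

Under an alternative \(g_\theta \) the same estimators converge instead to the mean and the rescaled standard deviation of \(g_\theta \), which play the role of \(\mu (\theta )\) and \(\sigma (\theta )\) in Condition B1.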
About this article
Cite this article
Meintanis, S., Milošević, B. & Obradović, M. Bahadur efficiency for certain goodness-of-fit tests based on the empirical characteristic function. Metrika 86, 723–751 (2023). https://doi.org/10.1007/s00184-022-00891-0
DOI: https://doi.org/10.1007/s00184-022-00891-0