Abstract
Let \(F_{\theta }\) be a family of distributions supported on the set \(Z_{0}\) of nonnegative integers. In this paper we derive the M-estimators with smallest gross error sensitivity (GES). We start by defining the uniform median of a distribution F with support on \(Z_{0}\), umed(F), as the median of \(x+u,\) where x and u are independent variables with distribution F and uniform distribution on \([-0.5,0.5],\) respectively. Under some general conditions we prove that the estimator with smallest GES satisfies umed\((F_{n})=\)umed\((F_{\theta }),\) where \(F_{n}\) is the empirical distribution. The asymptotic distribution of these estimators is derived. This distribution is normal except when there is a positive integer k such that \(F_{\theta }(k)=0.5.\) In this last case, the asymptotic distribution behaves like a normal distribution on each side of 0, but with different variances. A Monte Carlo simulation study compares, for the Poisson distribution, the finite-sample efficiency and robustness of this estimator with those of other robust estimators.
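Since \(x+u\) has a continuous, piecewise linear distribution function, the uniform median of an empirical distribution can be computed exactly by inverting the linear piece of that CDF at level 0.5. The following is a minimal Python sketch of this computation; the function name `umed` and its NumPy implementation are ours, for illustration only, and are not part of the paper.

```python
import numpy as np

def umed(sample):
    """Exact uniform median of the empirical distribution of an integer sample.

    umed(F) is the median of X + U, where X ~ F and U ~ Uniform[-0.5, 0.5]
    is independent of X.  The CDF of X + U equals
    F(k - 1) + (t - (k - 0.5)) * p(k) for t in [k - 0.5, k + 0.5],
    so the median is found by inverting this linear piece at level 0.5.
    """
    values, counts = np.unique(np.asarray(sample, dtype=int), return_counts=True)
    probs = counts / counts.sum()
    cum = np.cumsum(probs)
    # smallest observed value k with F_n(k) >= 0.5
    i = int(np.searchsorted(cum, 0.5))
    f_prev = cum[i - 1] if i > 0 else 0.0  # F_n(k - 1)
    return values[i] - 0.5 + (0.5 - f_prev) / probs[i]
```

Note that when \(F_{n}(k)=0.5\) exactly, the formula automatically returns \(k+0.5\), in agreement with the boundary case discussed in the paper.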
References
Cantoni E, Ronchetti E (2001) Robust inference for generalized linear models. J Am Stat Assoc 96:1022–1030
Hampel FR (1974) The influence curve and its role in robust estimation. J Am Stat Assoc 69:383–394
Hassler U (2018) Note on sample quantiles for ordinal data. Stat Pap. https://doi.org/10.1007/s00362-018-1054-5
Ma Y, Genton M, Parzen E (2011) Asymptotic properties of sample quantiles of discrete distributions. Ann Inst Stat Math 63:227–243
Maronna RA, Martin RD, Yohai VJ, Salibián-Barrera M (2019) Robust statistics: theory and methods (with R), 2nd edn. Wiley, Chichester
Valdora M, Yohai VJ (2014) Robust estimators for generalized linear models. J Stat Plan Inference 146:31–48
Acknowledgements
This research was partially supported by Grants 20020130100279BA and 20020170100330BA from the Universidad de Buenos Aires, Argentina, and by Grant PICT 201-0377 from ANPCYT.
This research was partially supported by Grants X-094 and 20020170100022BA from Universidad de Buenos Aires, PID 5505 from CONICET and PAV 120 and PICT 21407 from ANPCYT, Argentina.
Appendix: Proofs of results
1.1 Proof of Theorem 1
For the purposes of this proof it will be more convenient to state Hampel’s problem in its equivalent dual form, namely, to minimize the GES under a bound K on the asymptotic variance. It is known that the solution is again given by (9), where now m is a decreasing function of the bound K. For given m, call \(\widehat{\theta }_{m}\) the Hampel-optimal estimator given by (9). In this case (10) takes the form
When the bound K tends to infinity, \(m\rightarrow 0\) and the GES of \(\widehat{\theta }_{m}\) tends to its lower bound. Then to prove the Theorem it is enough to show that there exists \(m_{0}\) such that for \(m\le m_{0},\) this estimator coincides with the estimator given by (11).
We will suppose that \(\psi _{0}(k,\theta )\) is strictly increasing in k. The proof when it is strictly decreasing is similar. Put \(k_{0}^{*}(\theta )=k_{0}(F_{\theta })\) and let
It will be shown that if \(m\le m_{0}\) then
Suppose that \(c(m,\theta )<\psi _{0}(k_{0}^{*}(\theta ),\theta )-m_{0}.\) Then we have \(\psi _{0}(k,\theta )-c(m,\theta )>m_{0}\) for all \(k\ge k_{0}^{*}(\theta ),\) and hence
We also have
which implies
contradicting (19). Similarly, it can be proved that we cannot have \(c(m,\theta )>\psi _{0}(k_{0}^{*}(\theta ),\theta )-m_{0}.\)
From (21) and (20) we get that \(\psi _{m}^{H}(\psi _{0}(k,\theta )-c(m,\theta ))\le -m\) for \(k<k_{0}^{*}(\theta )\) and \(\psi _{m}^{H}(\psi _{0}(k,\theta )-c(m,\theta ))\ge m\) for \(k>k_{0}^{*}(\theta ).\) It then follows from (19) that
or, equivalently,
From (23) we derive
Define
and note that according to (24) for \(m\le m_{0},\) \(G_{m}\) does not depend on m. Then (24) and (4) imply that for \(m\le m_{0}\)
If \(m\le m_{0}\) , (9) is equivalent to
and using the same arguments that lead to (25), we can prove that (26) is equivalent to
Consider the estimator \(\widehat{\theta }\) defined by umed\((F_{\widehat{\theta }})=\)umed\((F_{n}).\) By (7) we have \(k_{0}(F_{n})=k_{0}^{*}(\widehat{\theta })\) and by (25) we get
Then (27) holds and this implies that (9) holds too. This proves the Theorem.
1.2 Proof of Lemma 1
Let \(X_{n}\sim F^{\left( n\right) }\) and \(X\sim F,\) and call \(G^{\left( n\right) }\) and G the distributions of \(X_{n}+u\) and \(X+u,\) respectively, where u has a uniform distribution on \([-0.5,0.5]\) independent of \(X_{n}\) and of X. Then \(G^{(n)}\) and G have positive densities, and \(G^{(n)}\rightarrow _{w}G\). Since umed\((F^{\left( n\right) })=\mathrm {med}\left( G^{\left( n\right) }\right) \) and umed\((F)=\mathrm {med}\left( G\right) ,\) and the median is a weakly continuous functional, the result follows.
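The continuity asserted by the lemma can be checked numerically: for empirical distributions \(F_{n}\) of an i.i.d. sample from F, umed\((F_{n})\) should approach umed(F). Below is a sketch for a Poisson(2) sample; the helper `umed`, the choice of \(\lambda =2\), and the sample size are ours, for illustration only.

```python
import math
import numpy as np

def umed(sample):
    """Exact uniform median of the empirical distribution of an integer sample."""
    values, counts = np.unique(np.asarray(sample, dtype=int), return_counts=True)
    probs = counts / counts.sum()
    cum = np.cumsum(probs)
    i = int(np.searchsorted(cum, 0.5))
    f_prev = cum[i - 1] if i > 0 else 0.0
    return values[i] - 0.5 + (0.5 - f_prev) / probs[i]

# umed of the theoretical Poisson(lam) distribution, from its pmf:
# find the smallest k with F(k) >= 0.5, then invert the linear piece.
lam = 2.0
k, cdf_prev, pk = 0, 0.0, math.exp(-lam)
while cdf_prev + pk < 0.5:
    cdf_prev += pk
    k += 1
    pk *= lam / k
u_theory = k - 0.5 + (0.5 - cdf_prev) / pk

# umed of a large Poisson(lam) sample; should be close to u_theory
rng = np.random.default_rng(0)
u_hat = umed(rng.poisson(lam, size=200_000))
```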
1.3 Proof of Theorem 3
Recall the notation defined in (12) and (13).
(a) If \(F(K)>0.5,\) then for large n we have \(k_{0}(F_{n})=K,\) and therefore
To derive the asymptotic distribution of \(Z_{n}\) we need first to calculate that of the vector
The vector \(d_{n}\) converges in distribution to a bivariate normal distribution with mean (0, 0) and covariance matrix
where
Then since for large n
the delta method yields that \(Z_{n}\rightarrow ^{D}N(0,\sigma ^{2})\) where
and a straightforward calculation yields (15).
(b) If \(F\left( K\right) =0.5\) it follows from (6) that \(\mathrm {umed}(F)=K+0.5.\) On the other hand, for large n we have
We are going to calculate \(Z_{n}\) in both cases. If \(F_{n}(K)\ge 0.5\) we have
and therefore
If \(F_{n}(K)<0.5\) it follows in the same way that
Note that, conversely, \(Z_{n}\le 0\) implies \(F_{n}(K)\ge 0.5\) and \(Z_{n}>0\) implies \(F_{n}(K)<0.5.\) Since \(n^{1/2}(0.5-F_{n}(K))\rightarrow _{d}N(0,0.25),\) the Central Limit Theorem and Slutsky’s Lemma yield (16).
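The two-sided behaviour described in case (b) can be illustrated by simulation. The sketch below uses an illustrative three-point distribution on \(\{0,1,2\}\) with \(F(0)=0.5\) exactly (so \(K=0\) and umed\((F)=0.5\)); the two closed-form branches for umed\((F_{n})\) are our own specialization of the case analysis above, and the probabilities, sample size and seed are arbitrary choices.

```python
import numpy as np

# Distribution with F(0) = 0.5 exactly: p(0) = 0.5, p(1) = 0.3, p(2) = 0.2
p = np.array([0.5, 0.3, 0.2])
rng = np.random.default_rng(12345)

n, reps = 400, 4000
z = np.empty(reps)
for r in range(reps):
    x = rng.choice(3, size=n, p=p)
    f0 = np.mean(x == 0)   # F_n(0); here also p_n(0)
    p1 = np.mean(x == 1)   # p_n(1)
    if f0 >= 0.5:
        # CDF of X + U crosses 0.5 on [-0.5, 0.5]: umed(F_n) = -0.5 + 0.5 / F_n(0)
        u = -0.5 + 0.5 / f0
    else:
        # crossing on [0.5, 1.5]: umed(F_n) = 0.5 + (0.5 - F_n(0)) / p_n(1)
        u = 0.5 + (0.5 - f0) / p1
    z[r] = np.sqrt(n) * (u - 0.5)   # Z_n

prop_nonpos = np.mean(z <= 0)   # should be near 0.5
# the two halves are asymptotically normal with different scales,
# roughly 0.5 / p(0) = 1 on the left and 0.5 / p(1) = 5/3 on the right
ratio = np.mean(np.abs(z[z > 0])) / np.mean(np.abs(z[z <= 0]))
```

With \(p(0)\ne p(1)\) the mean absolute value of the positive half visibly exceeds that of the negative half, matching the different variances on each side of 0.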
1.4 Proof of Theorem 4
Recall that \(\widehat{\theta }_{n}=g^{-1}\left( {\text {umed}}(F_{n})\right) \) and \(\theta =g^{-1}\left( {\text {umed}}(F_{\theta })\right) .\) Put for brevity \(K=k_{0}\left( F_{\theta }\right) .\)
(a) If \(F_{\theta }\left( K\right) >0.5\) there exists an interval I containing \(\theta \) such that \(t\in I\) implies that \(F_{t}\left( K\right) >0.5\) and \(F_{t}\left( K-1\right) <0.5,\) and therefore \(k_{0}\left( F_{t}\right) =K.\) Therefore g is differentiable at \(\theta ,\) and the result follows from Theorem 2, part (a), Theorem 3, and Slutsky’s Lemma.
(b) Assume now \(F_{\theta }\left( K\right) =0.5.\) Then \(t<\theta \) implies that \(F_{t}\left( K\right) >0.5,\) and therefore for sufficiently small \(\delta \) we have \(k_{0}\left( F_{\theta -\delta }\right) =K\) and \(k_{0}\left( F_{\theta +\delta }\right) =K+1.\) Then the left- and right-side derivatives of g at \(\theta \) are \(g_{-}^{\prime }\) and \(g_{+}^{\prime }\) given by (18), and therefore the left- and right-side derivatives of \(g^{-1}\) are \(1/g_{-}^{\prime }\) and \(1/g_{+}^{\prime },\) respectively.
We have
Note that for \(\tau <0\) we have \(g\left( t\right) -g\left( \theta \right) =\left( t-\theta \right) g_{-}^{\prime }+o\left( t-\theta \right) ,\) and that
with H defined in (16). The result follows by applying the delta method.
Maronna, R.A., Yohai, V.J. Optimal robust estimators for families of distributions on the integers. Stat Papers 62, 2269–2281 (2021). https://doi.org/10.1007/s00362-020-01187-z