
Approximating symmetrized estimators of scatter via balanced incomplete U-statistics

Annals of the Institute of Statistical Mathematics

Abstract

We derive limiting distributions of symmetrized estimators of scatter. Instead of considering all \(n(n-1)/2\) pairs of the \(n\) observations, we use only \(nd\) suitably chosen pairs, where \(d \ge 1\) is substantially smaller than \(n\). It turns out that the resulting estimators are asymptotically equivalent to the original ones whenever \(d = d(n) \rightarrow \infty\), no matter how slowly. We also investigate the asymptotic properties for arbitrary fixed \(d\). These considerations and numerical examples indicate that, for practical purposes, moderate fixed values of \(d\) between 10 and 20 already yield estimators which are computationally feasible and rather close to the original ones.


References

  • Barbour, A. D., Chen, L. H. Y. (eds.) (2005). An introduction to Stein’s method, Lecture Notes Series, Vol. 4, Institute for Mathematical Sciences, National University of Singapore, Singapore University Press.


  • Blom, G. (1976). Some properties of incomplete \(U\)-statistics. Biometrika, 63, 573–580.


  • Brown, B. M., Kildea, D. G. (1978). Reduced \(U\)-statistics and the Hodges–Lehmann estimator. The Annals of Statistics, 6, 828–835.


  • Dudley, R. M., Sidenko, S., Wang, Z. (2009). Differentiability of \(t\)-functionals of location and scatter. The Annals of Statistics, 37, 939–960.


  • Dümbgen, L. (1998). On Tyler’s \(M\)-functional of scatter in high dimension. Annals of the Institute of Statistical Mathematics, 50, 471–491.


  • Dümbgen, L., Nordhausen, K., Schuhmacher, H. (2014). fastM: Fast computation of multivariate M-estimators. R package, https://cran.r-project.org/web/packages/fastM

  • Dümbgen, L., Pauly, M., Schweizer, T. (2015). M-functionals of multivariate scatter. Statistics Surveys, 9, 32–105.


  • Dümbgen, L., Nordhausen, K., Schuhmacher, H. (2016). New algorithms for M-estimation of multivariate scatter and location. Journal of Multivariate Analysis, 144, 200–217.


  • Feller, W. (1945). The fundamental limit theorems in probability. Bulletin of the American Mathematical Society, 51, 800–832.


  • Hoeffding, W. (1948). A class of statistics with asymptotically normal distribution. The Annals of Mathematical Statistics, 19, 293–325.


  • Hoeffding, W. (1951). A combinatorial Central Limit Theorem. The Annals of Mathematical Statistics, 22, 558–566.


  • Kent, J. T., Tyler, D. E. (1991). Redescending \(M\)-estimates of multivariate location and scatter. The Annals of Statistics, 19, 2102–2119.


  • Lee, A. J. (1990). U-statistics—theory and practice (Vol. 110). New York: Marcel Dekker, Inc.


  • Miettinen, J., Nordhausen, K., Taskinen, S., Tyler, D. E. (2016). On the computation of symmetrized \(M\)-estimators of scatter. In C. Agostinelli, A. Basu, P. Filzmoser, D. Mukherjee (Eds.), Recent Advances in Robust Statistics: Theory and Applications (pp. 151–167). India: Springer.


  • Muandet, K., Fukumizu, K., Sriperumbudur, B., Schölkopf, B. (2017). Kernel mean embedding of distributions: A review and beyond. Foundations and Trends in Machine Learning, 10, 1–141.


  • Nordhausen, K., Tyler, D. E. (2015). A cautionary note on robust covariance plug-in methods. Biometrika, 102, 573–588.


  • Nordhausen, K., Oja, H., Ollila, E. (2008). Robust independent component analysis based on two scatter matrices. Austrian Journal of Statistics, 37, 91–100.


  • Paindaveine, D. (2008). A canonical definition of shape. Statistics and Probability Letters, 78, 2240–2247.


  • Serfling, R. J. (1980). Approximation theorems of mathematical statistics. Wiley series in probability and mathematical statistics, New York: Wiley.


  • Sirkia, S., Taskinen, S., Oja, H. (2007). Symmetrised M-estimators of multivariate scatter. Journal of Multivariate Analysis, 98, 1611–1629.


  • Stein, C. (1986). Approximate computation of expectations, Institute of mathematical statistics lecture notes—monograph series, Vol. 7, Hayward, CA: Institute of Mathematical Statistics.


  • Tyler, D. E. (1987). A distribution-free \(M\)-estimator of multivariate scatter. The Annals of Statistics, 15, 234–251.


  • Tyler, D. E., Critchley, F., Dümbgen, L., Oja, H. (2009). Invariant coordinate selection (with discussion). Journal of the Royal Statistical Society, Series B: Statistical Methodology, 71, 549–592.



Acknowledgements

We thank Sara Taskinen for stimulating discussions. Constructive comments of three referees are gratefully acknowledged.

Author information


Corresponding author

Correspondence to Lutz Dümbgen.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The work of L. Dümbgen was supported by the Swiss National Science Foundation.

A Auxiliary results

A.1 A particular coupling of random permutations

Preparations. For an integer \(n \ge 1\), let \({\mathcal {S}}_n\) be the set of all permutations of \(\langle n\rangle := \{1,2,\ldots ,n\}\). A cycle in \({\mathcal {S}}_n\) is a permutation \(\sigma \in {\mathcal {S}}_n\) such that for \(m \ge 1\) pairwise different points \(a_1,\ldots ,a_m \in \langle n\rangle\),

$$\begin{aligned} a_1 \ \mapsto \ a_2 \mapsto \ \cdots \ \mapsto \ a_m \ \mapsto \ a_1, \end{aligned}$$

while \(\sigma (i) = i\) for \(i \in \langle n\rangle {\setminus } \{a_1,\ldots ,a_m\}\). (In the case of \(m = 1\), \(\sigma (i) = i\) for all \(i \in \langle n\rangle\).) We write

$$\begin{aligned} \sigma \ = \ (a_1,\ldots ,a_m)_{\textrm{c}} \end{aligned}$$

for this mapping and note that it has m equivalent representations

$$\begin{aligned} \sigma \ = \ (a_1,\ldots ,a_m)_{\textrm{c}} \ = \ (a_2,\ldots ,a_m,a_1)_{\textrm{c}} \ = \ \cdots \ = \ (a_m,a_1,\ldots ,a_{m-1})_{\textrm{c}}. \end{aligned}$$

Any permutation \(\sigma \in {\mathcal {S}}_n\) can be written as

$$\begin{aligned} \sigma \ = \ (a_{11},\ldots ,a_{1m(1)})_{\textrm{c}} \circ \cdots \circ (a_{k1},\ldots ,a_{km(k)})_{\textrm{c}}, \end{aligned}$$

where the sets \(\{a_{j1},\ldots ,a_{jm(j)}\}\), \(1 \le j \le k\), form a partition of \(\langle n\rangle\). Note that the cycles \((a_{j1},\ldots ,a_{jm(j)})_{\textrm{c}}\), \(1\le j\le k\), commute. This representation of \(\sigma\) as a combination of cycles is unique if we require, for instance, that

$$\begin{aligned} a_{jm(j)} \ = \ \min \{a_{j1},\ldots ,a_{jm(j)}\} \quad \text {for} \ 1 \le j \le k \end{aligned}$$

and

$$\begin{aligned} a_{1m(1)}< \cdots < a_{km(k)}. \end{aligned}$$
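The standardized representation can be computed mechanically. The following is a minimal Python sketch, not taken from the paper: the function name `standardized_cycles` and the encoding of \(\sigma\) as a dictionary `{i: sigma(i)}` on \(\{1,\ldots ,n\}\) are assumptions made for illustration.

```python
def standardized_cycles(sigma):
    """Cycles of sigma, each ending in its minimum, ordered by increasing minima.

    sigma is assumed to be a dict mapping i -> sigma(i) for i = 1, ..., n.
    """
    n = len(sigma)
    seen = set()
    cycles = []
    for start in range(1, n + 1):
        if start in seen:
            continue
        # start is the smallest element of a cycle not yet visited,
        # so it becomes the last entry a_{j m(j)} of that cycle.
        cycle = []
        a = sigma[start]
        while a != start:
            cycle.append(a)
            a = sigma[a]
        cycle.append(start)
        seen.update(cycle)
        cycles.append(tuple(cycle))
    return cycles


# 1 -> 3 -> 5 -> 4 -> 1 and 2 is a fixed point
example = {1: 3, 2: 2, 3: 5, 4: 1, 5: 4}
print(standardized_cycles(example))
```

Because the outer loop visits starting points in increasing order, the minima of the returned cycles are automatically increasing, which is the second uniqueness requirement.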

In what follows, let \({\mathcal {S}}_n^*\) be the set of all permutations \(\sigma \in {\mathcal {S}}_n\) consisting of just one cycle, i.e.,

$$\begin{aligned} \sigma \ = \ (a_1,a_2,\ldots ,a_n)_{\textrm{c}} \end{aligned}$$

with pairwise different numbers \(a_1, a_2, \ldots , a_n \in \langle n\rangle\).

The coupling. The standardized cycle representation of \(\sigma \in {\mathcal {S}}_n\) gives rise to a particular mapping \({\mathcal {S}}_n \ni \pi \mapsto (\sigma ,\sigma ^*) \in {\mathcal {S}}_n \times {\mathcal {S}}_n^*\) such that \(\pi \mapsto \sigma\) is bijective. For fixed \(\pi \in {\mathcal {S}}_n\) and any index \(i \in \langle n\rangle\), let

$$\begin{aligned} M_i \,= \ \langle n\rangle \setminus \{\pi (s): 1 \le s < i\}, \end{aligned}$$

i.e., \(\langle n\rangle = M_1 \supset M_2 \supset \cdots \supset M_n = \{\pi (n)\}\), and \(\# M_i = n+1-i\). Let \(1 \le t_1< t_2< \cdots < t_k = n\) be those indices i such that \(\pi (i) = \min (M_i)\). Then,

$$\begin{aligned} \sigma \,= \ \left( \pi (1), \ldots , \pi (t_1) \right) _{\textrm{c}} \circ \left( \pi (t_1+1),\ldots ,\pi (t_2) \right) _{\textrm{c}} \circ \cdots \circ \left( \pi (t_{k-1}+1), \ldots , \pi (t_k) \right) _{\textrm{c}} \end{aligned}$$

defines a permutation of \(\langle n\rangle\) with standardized cycle representation. This is essentially the construction used by Feller (1945) to investigate the number of cycles of a random permutation. Moreover,

$$\begin{aligned} \sigma ^* \,= \ \left( \pi (1), \pi (2), \ldots , \pi (n) \right) _{\textrm{c}} \end{aligned}$$

defines a permutation in \({\mathcal {S}}_n^*\) such that

$$\begin{aligned} \left\{ i \in \langle n\rangle : \sigma (i) \ne \sigma ^*(i) \right\} \ = \ {\left\{ \begin{array}{ll} \emptyset &{} \text {if} \ k = 1, \\ \{\pi (t_1),\ldots ,\pi (t_k)\} &{} \text {if} \ k \ge 2. \end{array}\right. } \end{aligned}$$
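To make the construction concrete, here is a small Python sketch that builds \(\sigma\) and \(\sigma ^*\) from a given \(\pi\) and lists the points at which they disagree. It is only an illustration under stated assumptions: the names `coupling` and `mismatch_set` are hypothetical, \(\pi\) is encoded as a list with `pi[s - 1]` equal to \(\pi (s)\), and permutations are returned as dictionaries.

```python
def coupling(pi):
    """Return (sigma, sigma_star, t) for a permutation pi of 1..n.

    pi is a list with pi[s - 1] = pi(s); t lists the indices t_1 < ... < t_k.
    """
    n = len(pi)
    remaining = set(range(1, n + 1))  # plays the role of M_i
    t = []
    for i, value in enumerate(pi, start=1):
        if value == min(remaining):
            t.append(i)
        remaining.remove(value)

    # sigma: cut (pi(1), ..., pi(n)) after each t_j and close every block to a cycle
    sigma = {}
    start = 0
    for end in t:
        block = pi[start:end]
        for pos, a in enumerate(block):
            sigma[a] = block[(pos + 1) % len(block)]
        start = end

    # sigma_star: the single cycle (pi(1), ..., pi(n))
    sigma_star = {pi[s]: pi[(s + 1) % n] for s in range(n)}
    return sigma, sigma_star, t


def mismatch_set(pi):
    """Points i with sigma(i) != sigma_star(i)."""
    sigma, sigma_star, _ = coupling(pi)
    return {i for i in sigma if sigma[i] != sigma_star[i]}


sigma, sigma_star, t = coupling([3, 1, 4, 2, 5])
print(t, sigma, sigma_star, mismatch_set([3, 1, 4, 2, 5]))
```

In this example \(k = 3\), and \(\sigma\) and \(\sigma ^*\) disagree at exactly three points, so the number of disagreements equals the number of cycles of \(\sigma\) whenever \(k \ge 2\).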

Suppose that \(\pi\) is a random permutation with uniform distribution on \({\mathcal {S}}_n\). Then, \(\sigma\) is a random permutation with uniform distribution on \({\mathcal {S}}_n\) too, because \(\pi \mapsto \sigma\) is a bijection. Since the conditional distribution of \(\pi (i)\), given \((\pi (s))_{1 \le s < i}\), is the uniform distribution on \(M_i\), the random variables

$$\begin{aligned} Y_i \,= \ 1_{[\pi (i) = \min (M_i)]}, \quad i \in \langle n\rangle , \end{aligned}$$

are stochastically independent Bernoulli random variables with \(\textrm{I}\!\textrm{P}(Y_i = 1) = (n+1-i)^{-1} = 1 - \textrm{I}\!\textrm{P}(Y_i = 0)\). Consequently,

$$\begin{aligned} \textrm{I}\!\textrm{E}\left( \# \left\{ i \in \langle n\rangle : \sigma (i) \ne \sigma ^*(i) \right\} \right) \ \le \ \sum _{i=1}^n (n+1-i)^{-1} \ = \ 1 + \sum _{j=2}^n j^{-1} \ \le \ 1 + \log (n), \end{aligned}$$

because \(j^{-1} \le \int _{j-1}^j x^{-1} \, dx = \log (j) - \log (j-1)\) for \(2 \le j \le n\).
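The bound can be checked exactly for small \(n\) by enumerating all permutations. The sketch below is illustrative only (the helper names are hypothetical): it computes the mean of \(\sum _i Y_i\) over all \(\pi \in {\mathcal {S}}_n\) with exact rational arithmetic and compares it with the harmonic sum and with \(1 + \log (n)\).

```python
from fractions import Fraction
from itertools import permutations
from math import log


def count_minima(pi):
    """Number of indices i with pi(i) = min(M_i), i.e. the sum of the Y_i."""
    remaining = set(range(1, len(pi) + 1))
    k = 0
    for value in pi:
        if value == min(remaining):
            k += 1
        remaining.remove(value)
    return k


def mean_count(n):
    """Exact mean of the sum of the Y_i under the uniform distribution on S_n."""
    perms = list(permutations(range(1, n + 1)))
    return Fraction(sum(count_minima(p) for p in perms), len(perms))


def harmonic(n):
    """Exact value of sum_{j=1}^n 1/j."""
    return sum(Fraction(1, j) for j in range(1, n + 1))


for n in range(1, 7):
    print(n, mean_count(n), harmonic(n), 1 + log(n))
```

Enumeration is only feasible for small \(n\) since \(\# {\mathcal {S}}_n = n!\); for larger \(n\) one would sample \(\pi\) at random instead.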

A.2 Some inequalities related to Lindeberg-type conditions

In connection with Gaussian approximations and Stein’s method, see Stein (1986) or Barbour and Chen (2005), the quantity

$$\begin{aligned} L(X) \,= \ \textrm{I}\!\textrm{E}\left( X^2 \min (|X|,1) \right) \end{aligned}$$

for a square-integrable random variable X plays an important role. Elementary considerations show that

$$\begin{aligned} h(x) \ \le \ x^2 \min (|x|,1) \ \le \ \sqrt{2} \, h(x) \quad \text {with}\quad h(x) \,= \ \frac{|x|^3}{\sqrt{1 + x^2}} \end{aligned}$$

for arbitrary \(x \in {\mathbb {R}}\). Moreover, \(h: {\mathbb {R}}\rightarrow [0,\infty )\) is an even, convex function such that \(h(2x) \le 8 h(x)\). Consequently, for arbitrary \(x,y \in {\mathbb {R}}\), Jensen’s inequality implies that

$$\begin{aligned} (x + y)^2 \min (|x + y|, 1)&\le \ \sqrt{2} \, h(x + y) \\&\le \ 2^{-1/2} \left( h(2x) + h(2y) \right) \\&\le \ \sqrt{32} \, h(x) + \sqrt{32} \, h(y) \\&\le \ 6 x^2 \min (|x|, 1) + 6 y^2 \min (|y|,1) \ \le \ 6 x^2 \min (|x|,1) + 6 y^2 . \end{aligned}$$
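As a sanity check, each step of this chain can be evaluated numerically for real \(x, y\). The following Python sketch does so on a grid; the function names `g` and `check_chain` and the tolerance are assumptions for illustration, and a numerical check is of course no substitute for the proof.

```python
from math import sqrt


def h(x):
    """h(x) = |x|^3 / sqrt(1 + x^2)."""
    return abs(x) ** 3 / sqrt(1.0 + x * x)


def g(x):
    """g(x) = x^2 * min(|x|, 1), the integrand in L(X)."""
    return x * x * min(abs(x), 1.0)


def check_chain(x, y, tol=1e-12):
    """True if every consecutive inequality of the chain holds at (x, y)."""
    steps = [
        g(x + y),
        sqrt(2) * h(x + y),               # g <= sqrt(2) * h
        (h(2 * x) + h(2 * y)) / sqrt(2),  # convexity of h
        sqrt(32) * (h(x) + h(y)),         # h(2x) <= 8 h(x)
        6 * g(x) + 6 * g(y),              # h <= g and sqrt(32) <= 6
        6 * g(x) + 6 * y * y,             # min(|y|, 1) <= 1
    ]
    # small relative tolerance to absorb rounding where equality holds
    return all(a <= b + tol * (1 + abs(b)) for a, b in zip(steps, steps[1:]))


grid = [k / 4 for k in range(-20, 21)]
print(all(check_chain(x, y) for x in grid for y in grid))
```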

For a symmetric matrix \(A \in {\mathbb {R}}^{n\times n}\), we define its row means \(\bar{A}_i:= n^{-1} \sum _{j=1}^n A_{ij}\) and its overall mean \(\bar{A}:= n^{-2} \sum _{i,j=1}^n A_{ij}\). Let \(\tilde{A}:= (A_{ij} - \bar{A}_i - \bar{A}_j + \bar{A})_{i,j=1}^n\). Then, elementary calculations and the previous inequalities reveal that

$$\begin{aligned} 0 \ \le \ n^{-1} \sum _{i,j=1}^n A_{ij}^2 - n^{-1} \sum _{i,j=1}^n \tilde{A}_{ij}^2 \ \le \ 2 \sum _{i=1}^n \bar{A}_i^2 \end{aligned}$$

and

$$\begin{aligned} n^{-1} \sum _{i,j=1}^n \tilde{A}_{ij}^2 \min (|\tilde{A}_{ij}|,1) \ \le \ 6 n^{-1} \sum _{i,j=1}^n A_{ij}^2 \min (|A_{ij}|,1) + 12 \sum _{i=1}^n \bar{A}_i^2. \end{aligned}$$
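Both inequalities are easy to test numerically on random symmetric matrices. The sketch below uses plain Python lists; the helper names (`double_center`, `check_inequalities`, `random_symmetric`), the Gaussian entries and the tolerance are illustrative assumptions rather than part of the paper.

```python
import random


def g(x):
    """g(x) = x^2 * min(|x|, 1)."""
    return x * x * min(abs(x), 1.0)


def double_center(A):
    """Return (A_tilde, row_means) for a symmetric n x n matrix A (list of lists)."""
    n = len(A)
    row = [sum(A[i]) / n for i in range(n)]
    overall = sum(row) / n
    A_tilde = [[A[i][j] - row[i] - row[j] + overall for j in range(n)]
               for i in range(n)]
    return A_tilde, row


def check_inequalities(A, tol=1e-9):
    """Check the two displayed inequalities; returns a pair of booleans."""
    n = len(A)
    A_tilde, row = double_center(A)
    sq = sum(a * a for r in A for a in r) / n
    sq_tilde = sum(a * a for r in A_tilde for a in r) / n
    row_sq = sum(r * r for r in row)
    first = -tol <= sq - sq_tilde <= 2 * row_sq + tol
    lhs = sum(g(a) for r in A_tilde for a in r) / n
    rhs = 6 * sum(g(a) for r in A for a in r) / n + 12 * row_sq
    return first, lhs <= rhs + tol


def random_symmetric(n, rng, scale=2.0):
    """Symmetric matrix with independent Gaussian entries on and above the diagonal."""
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            A[i][j] = A[j][i] = rng.gauss(0.0, scale)
    return A


rng = random.Random(0)
print([check_inequalities(random_symmetric(n, rng)) for n in (2, 5, 10)])
```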

About this article


Cite this article

Dümbgen, L., Nordhausen, K. Approximating symmetrized estimators of scatter via balanced incomplete U-statistics. Ann Inst Stat Math 76, 185–207 (2024). https://doi.org/10.1007/s10463-023-00879-1


  • DOI: https://doi.org/10.1007/s10463-023-00879-1
