Asymptotic properties of $M$-estimators in linear and nonlinear multivariate regression models

Withers, Christopher S.; Nadarajah, Saralees

doi:10.1007/s00184-013-0458-4

Published: 23 August 2013

Metrika, Volume 77, pages 647–673 (2014)

Christopher S. Withers¹ &
Saralees Nadarajah²

Abstract

We consider the (possibly nonlinear) regression model in $\mathbb{R}^q$ with shift parameter $\alpha$ in $\mathbb{R}^q$ and other parameters $\beta$ in $\mathbb{R}^p$. Residuals are assumed to be from an unknown distribution function (d.f.). Let $\widehat{\phi}$ be a smooth $M$-estimator of $\phi = \binom{\beta}{\alpha}$ and $T(\phi)$ a smooth function. We obtain the asymptotic normality, covariance, bias and skewness of $T(\widehat{\phi})$ and an estimator of $T(\phi)$ with bias $\sim n^{-2}$ requiring $\sim n$ calculations. (In contrast, the jackknife and bootstrap estimators require $\sim n^2$ calculations.) For a linear regression with random covariates of low skewness, if $T(\phi) = \nu \beta$, then $T(\widehat{\phi})$ has bias $\sim n^{-2}$ (not $n^{-1}$) and skewness $\sim n^{-3}$ (not $n^{-2}$), and the usual approximate one-sided confidence interval (CI) for $T(\phi)$ has error $\sim n^{-1}$ (not $n^{-1/2}$). These results extend to random covariates.
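As a rough illustration of the kind of estimator the abstract refers to, the following sketch fits a scalar-response ($q = 1$) linear model $Y = \alpha + x'\beta + e$ by a smooth $M$-estimator. Everything specific here is an assumption for illustration and is not taken from the paper: the loss $\rho(e) = \log\cosh(e)$ (so $\psi = \tanh$), a known unit residual scale, the iteratively reweighted least squares scheme, and the function name `m_estimate`.

```python
import numpy as np

def m_estimate(X, y, n_iter=200, tol=1e-10):
    """Smooth M-estimate of (alpha, beta) minimising sum(log cosh(y - alpha - X beta)).

    Assumptions (illustrative only): psi = tanh, residual scale fixed at 1,
    iteratively reweighted least squares started from ordinary least squares.
    """
    y = np.asarray(y, dtype=float)
    Z = np.column_stack([np.ones(len(y)), X])    # intercept column carries the shift alpha
    phi = np.linalg.lstsq(Z, y, rcond=None)[0]   # least squares starting value
    for _ in range(n_iter):
        r = y - Z @ phi                          # current residuals
        w = np.ones_like(r)                      # weight psi(r)/r, equal to 1 in the limit r -> 0
        big = np.abs(r) > 1e-8
        w[big] = np.tanh(r[big]) / r[big]
        ZW = Z * w[:, None]
        new_phi = np.linalg.solve(Z.T @ ZW, ZW.T @ y)   # weighted normal equations
        converged = np.max(np.abs(new_phi - phi)) < tol
        phi = new_phi
        if converged:
            break
    return phi                                   # phi[0] = alpha, phi[1:] = beta
```

With `x = np.arange(10.0)` and `y = 1 + 2 * x`, calling `m_estimate(x, y)` returns values close to `(1, 2)`; shifting a single response by a large amount moves the fitted slope far less than it moves the least squares slope, since the weight $\psi(r)/r$ shrinks for large residuals.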

References

Andrews DF, Bickel PJ, Hampel FR, Huber PJ, Rogers WH, Tukey JW (1972) Robust estimates of location: survey and advances. Princeton University Press, Princeton
Bickel PJ, Klaassen CAJ, Ritov Y, Wellner JA (1993) Efficient and adaptive estimation for semiparametric models. Johns Hopkins University Press, Baltimore
Bierens HJ (1981) Robust methods and asymptotic theory in nonlinear econometrics. Springer, New York
Cai T, Duan S, Ren T, Liu F (2010) A robust parametric method for power harmonic estimation based on M-estimators. Measurement 43:67–77
Carroll RJ, Ruppert D (1982) Robust estimation in heteroscedastic linear models. Ann Stat 10:429–441
Chang X-W (2006) Computation of Huber’s M-estimates for a block-angular regression problem. Comput Stat Data Anal 50:5–20
Chao J-C, Douglas SC (2007) A robust complex fast ICA algorithm using the Huber M-estimator cost function. Lect Notes Comput Sci 4666:152–160
Chen J, Li DG, Lin ZY (2011) Asymptotic expansion for nonparametric M-estimator in a nonlinear regression model with long-memory errors. J Stat Plan Inference 141:3035–3046
Deergha Rao K, Raju BVSSN (2006) Improved robust multiuser detection in non-Gaussian channels using a new M-estimator and spatiotemporal chaotic spreading sequences. In: Proceedings of the IEEE Asia Pacific conference on circuits and systems pp 1729–1732
Douglas SC, Chao J-C (2007) Simple, robust, and memory-efficient fast ICA algorithms using the Huber M-estimator cost function. J VLSI Signal Process 48:143–159
El-Yamany NA, Papamichalis PE (2008) Robust color image superresolution: An adaptive M-estimation framework. EURASIP J Image Video Process Article ID 763254
Faurie F, Giremus A (2010) Combining generalized likelihood ratio and M-estimation for the detection/compensation of GPS measurement biases. In: Proceedings of the 2010 IEEE international conference on acoustics, speech, and signal processing, pp 4178–4181
Fouad MM, Dansereau RM, Whitehead AD (2011) Two-step super-resolution technique using bounded total variation and bisquare M-estimator under local illumination changes. In: Proceedings of the 18th international conference on image processing, pp 1381–1384
Fraiman R (1983) General $M$-estimators and applications to bounded influence estimation for non-linear regression. Commun Stat Theory Methods 12:2617–2631
Hajek J, Sidak Z (1967) Theory of rank tests. Academic Press, New York
Hampel FR, Ronchetti EM, Rousseeuw PJ, Stahel WA (1986) Robust statistics: the approach based on influence functions. Wiley, New York
Hassaïne Y, Delourme B, Panciatici P, Walter E (2005) M-Arctan estimator based on the trust-region method. Int J Electr Power Energy Syst 28:590–598
Hoseinnezhad R, Bab-Hadiashar A (2011) An M-estimator for high breakdown robust estimation in computer vision. Comput Vis Image Underst 115:1145–1156
Huber PJ (1981) Robust statistics. Wiley, New York
Huber PJ (1996) Robust statistical procedures, 2nd edn. Society for Industrial and Applied Mathematics, Philadelphia
Huber PJ, Ronchetti EM (2009) Robust statistics, 2nd edn. Wiley, New York
Kalyani S, Giridhar K (2007) Mitigation of error propagation in decision directed OFDM channel tracking using generalized M estimators. IEEE Trans Signal Process 55:1659–1672
Katkovnik V (1999) Robust M-estimates of the frequency and amplitude of a complex-valued harmonic. Signal Process 77:71–84
Katkovnik V, Lee M-S, Kim Y-H (2008) Robust M-estimation techniques for non-Gaussian CDMA wireless channels with phased array antenna. Signal Process 88:670–684
Kawarnura K, Hasegawa K, Yamashita O, Sato Y, Ikeuchi K (1999) Object recognition using local EGI and 3D models with M-estimators. In: Proceedings of the international conference on multisensor fusion and integration for intelligent systems, pp 80–86
Koul HL (1992) Weighted empiricals and linear models. Institute of Mathematical Statistics, Hayward
Koul HL (2002) Weighted empirical processes in dynamic nonlinear models, 2nd edn. Institute of Mathematical Statistics, Hayward
Lee M-J (2010) Micro-econometrics: methods of moments and limited dependent variables, 2nd edn. Springer, New York
Liese F, Miescke K-J (2008) Statistical decision theory: estimation, testing, and selection. Springer, New York
Marazzi A (1993) Algorithms, routines, and S functions for robust statistics. Wadsworth and Brooks/Cole Advanced Books and Software, Pacific Grove
Maronna RA, Yohai VJ (1981) Asymptotic behaviour of general $M$-estimates for regression and scale with random carriers. Probab Theory Relat Fields 58:7–20
Maronna RA, Martin RD, Yohai VJ (2006) Robust statistics: theory and methods. Wiley, Chichester
Mitra S, Mitra A, Kundu D (2011) Genetic algorithm and M-estimator based robust sequential estimation of parameters of nonlinear sinusoidal signals. Commun Nonlinear Sci Numer Simul 16:2796–2809
Nguyen N-V, Shevlyakov G, Shin V (2010) Alternative to M-estimates in multisensor data fusion. World Acad Sci Eng Technol 46:1034–1038
Park Y, Kim D, Kim S (2012) Robust regression using data partitioning and M-estimation. Commun Stat Simul Comput 41:1282–1300
Peracchi F (2001) Econometrics. Wiley, Chichester
Pfanzagl J (1994) Parametric statistical theory. Walter de Gruyter and Company, Berlin
Pham DS, Leung YH, Zoubir A, Brcic R (2004) Sequential M-estimation. In: Proceedings of the IEEE international conference on acoustics, speech, and signal processing, pp 697–700
Powell JL (1984) Least absolute deviations estimation for the censored regression. J Econom 25:303–325
Prakasa Rao BLS (1999) Statistical inference for diffusion type processes. Edward Arnold, London
Randles RH, Wolfe DA (1979) Introduction to the theory of nonparametric statistics. Wiley, New York
Rao CR, Toutenburg H (1995) Linear models: least squares and alternatives. Springer, New York
Rey WJJ (1978) Robust statistical methods. Springer, Berlin
Rieder H (1994) Robust asymptotic statistics. Springer, New York
Serfling RJ (1980) Approximation theorems of mathematical statistics. Wiley, New York
Staudte RG, Sheather SJ (1990) Robust estimation and testing. Wiley, New York
Sutarno D (2008) Constrained robust estimation of magnetotelluric impedance functions based on a bounded-influence regression M-estimator and the Hilbert transform. Nonlinear Process Geophys 15:287–293
van der Vaart AW, Wellner JA (1996) Weak convergence and empirical processes. Springer, New York
van de Geer SA (2000) Applications of empirical process theory. Cambridge University Press, Cambridge
Venetsanopoulos AN, Zervakis ME (1990) M-estimators in robust nonlinear image restoration. Opt Eng 29:455–470
Verboon P (1994) A robust approach to nonlinear multivariate analysis. D.S.W.O Press, Leiden
Withers CS (1982a) The distribution and quantiles of a function of parameter estimates. Ann Inst Stat Math A 34:55–68
Withers CS (1982b) Second order inference for asymptotically normal random variables. Sankhyā 44:19–27
Withers CS (1983) Expansions for the distribution and quantiles of a regular functional of the empirical distribution with applications to nonparametric confidence intervals. Ann Stat 11:577–587
Withers CS (1987) Bias reduction by Taylor series. Commun Stat Theory Methods 16:2369–2384
Withers CS (1989) Accurate confidence intervals when nuisance parameters are present. Commun Stat Theory Methods 18:4229–4259
Withers CS, Nadarajah S (2007) $M$-estimates for stationary and scaled residuals. Random Oper Stoch Equ 15:287–296
Withers CS, Nadarajah S (2011) Expansions for the distribution of M-estimates with applications to the multi-tone problem. ESAIM: Probab Stat 15:139–167
Xiong B, Yin Z (2011) Structural similar patches for nonlocal-means with modified robust M-estimator and residual images. In: Proceedings of the 2011 IEEE international conference on mechatronics and automation, pp 709–714

Acknowledgments

The authors would like to thank the Editor and the referee for careful reading and for their comments which greatly improved the paper.

Author information

Authors and Affiliations

Applied Mathematics Group, Industrial Research Limited, Lower Hutt, New Zealand
Christopher S. Withers
School of Mathematics, University of Manchester, Manchester, M13 9PL, UK
Saralees Nadarajah

Corresponding author

Correspondence to Saralees Nadarajah.

Appendices

Appendix A

Here, we illustrate how to obtain the derivatives $\delta _{i, j, \ldots } (\theta ) = \partial _i \partial _j \cdots \delta (S_R)$ at $S_R = \theta _R$, where $\theta _R = \mathbb E [ S_R ]$ for $S_R$ of (3.11) and $i, j, \ldots $ range over $1, \ldots , \dim (S_R)$. The derivatives of order up to $2r$ are required to obtain $C_r$ of

$$\begin{aligned} \mathbb E \left[ \widehat{\phi } \right] \approx \phi + \sum _{r = 1}^\infty n^{-r} C_r, \end{aligned}$$

or more generally $D_r$ of $\mathbb E [ t (\widehat{\phi }) ] \approx t (\phi ) + \sum _{r = 1}^\infty n^{-r} D_r$. The derivatives of order up to $2r$ are also required to obtain an estimator of $\phi $ or $t (\phi )$ with bias $O (n^{-r-1})$. Theorem 9.1 gives formulae for them up to order four, so allowing bias reduction to $O (n^{-3})$ via (3.18).
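To make the expansion $\mathbb E [ t (\widehat{\phi }) ] \approx t (\phi ) + \sum _r n^{-r} D_r$ concrete, here is a toy check that is not taken from the paper: for the sample mean $\bar X$ and $t(\mu ) = \mu ^2$, one has $\mathbb E [ \bar X^2 ] = \mu ^2 + \sigma ^2 / n$ exactly, so $D_1 = \sigma ^2$ and $D_r = 0$ for $r \ge 2$. Subtracting the plug-in estimate $s^2 / n$ removes the bias at a cost of $\sim n$ operations. The sketch verifies both facts by exact enumeration over a small discrete distribution; the function name and the support values are hypothetical, and this is not the estimator of (3.18).

```python
from fractions import Fraction
from itertools import product

def exact_expectations(support, n):
    """Exact E[xbar^2] and E[xbar^2 - s^2/n] for n i.i.d. draws, uniform on `support`.

    Enumerates all len(support)**n equally likely samples with rational arithmetic,
    so the two returned expectations carry no simulation error.
    """
    total_plain = Fraction(0)
    total_corrected = Fraction(0)
    count = 0
    for sample in product(support, repeat=n):
        xs = [Fraction(x) for x in sample]
        xbar = sum(xs) / n
        s2 = sum((x - xbar) ** 2 for x in xs) / (n - 1)   # unbiased sample variance
        total_plain += xbar ** 2                          # plug-in estimate t(xbar)
        total_corrected += xbar ** 2 - s2 / n             # plug-in minus estimated D_1 / n
        count += 1
    return total_plain / count, total_corrected / count

support, n = (0, 1, 3), 3
mu = Fraction(sum(support), len(support))
sigma2 = Fraction(sum(x * x for x in support), len(support)) - mu ** 2
plain, corrected = exact_expectations(support, n)
print(plain - mu ** 2, sigma2 / n, corrected - mu ** 2)
```

In this toy case the first-order correction removes the bias completely because the higher coefficients vanish; in general only the leading term is removed and a remainder of smaller order is left.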

Theorem 9.1

For $x, y, \ldots \in S_R$, set $(i_1, \ldots , i_r | x, y, \ldots ) = \partial _x \partial _y \cdots (\delta _{i_1} \cdots \delta _{i_r})$, where $\partial _x = \partial / \partial x$ and $(\cdot )_0 = (\cdot )$ at $S_R = \theta _R$. Then $\{ (h | x_1, \ldots , x_r)_0 \} = \delta _{h \cdot i_1, \ldots , i_r} (\theta _R )$ for $r \le 4$ are given by

$$\begin{aligned} \left( h | x \right) _0 = -I \left( x = S_h \right) \end{aligned}$$

(9.1)

and

$$\begin{aligned} \left( h | x_1, \ldots , x_s \right) _0&= -\sum ^{s-1}_{r=1} \sum ^s_{x_1, \ldots , x_s} I \left( x_s = S_{h, i_1, \ldots , i_r} \right) \left. \left( i_1, \ldots , i_r \right| x_1, \ldots , x_{s-1} \right) _0 \nonumber \\&-\sum ^s_{r=2} \left( \mathbb E \left[ S_{h, i_1, \ldots , i_r} \right] \right) \left. \left( i_1, \ldots , i_r \right| x_1, \ldots , x_s \right) _0 \end{aligned}$$

(9.2)

for $s \ge 2$, where $I (A) = 1$ if $A$ is true, $I (A) = 0$ if $A$ is false, and

$$\begin{aligned} \displaystyle \left. \left( i_1, \ldots , i_r \right| x_1, \ldots , x_s \right) _0 = \left\{ \begin{array}{l@{\quad }l} 0, &{} \text{ for } s < r, \\ \displaystyle \sum _{x_1, \ldots , x_r}^{r!} \left. \left( i_1 \right| x_1 \right) _0 \cdots \left. \left( i_r \right| x_r \right) _0, \displaystyle &{} \text{ for } s=r. \end{array} \right. \qquad \end{aligned}$$

(9.3)

Also, in an obvious extension of the notation of (1.6),

$$\begin{aligned}&\left. \left( i_1, i_2 \right| x_1, x_2, x_3 \right) _0 = \sum ^6 \left. \left( i_1 \right| x_1, x_2 \right) _0 \left. \left( i_2 \right| x_3 \right) _0\!, \\&\left. \left( i_1, i_2 \right| x_1, \ldots , x_4 \right) _0 = \sum ^8 \left. \left( i_1 \right| x_1, x_2, x_3 \right) _0 \left. \left( i_2 \right| x_4 \right) _0 + \sum ^6 \left. \left( i_1 \right| x_1, x_2 \right) _0 \left. \left( i_2 \right| x_3, x_4 \right) _0\!,\\&\left. \left( i_1, i_2, i_3 \right| x_1, \ldots , x_4 \right) _0 = \sum ^{12} \left. \left( i_1 \right| x_1, x_2 \right) _0 \left. \left( i_2 \right| x_3 \right) _0 \left. \left( i_3 \right| x_4 \right) _0\!. \end{aligned}$$

Similarly, we can obtain $(i_1, \ldots , i_r | x_1, \ldots , x_s)_0$ from their values for $r=1$.
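The base cases (9.1) and (9.3) can be sketched in a few lines. The encoding is an assumption made only for this illustration: each component $S_{h, i_1, \ldots , i_r}$ of $S_R$ is labelled by the tuple `(h, i1, ..., ir)`, so that $x = S_h$ becomes `x == (h,)`, and the function names are hypothetical. The case $s > r$ needs the recursion (9.2) and is deliberately left out.

```python
from itertools import permutations
from math import prod

def first_derivative(h, x):
    """(h | x)_0 = -I(x = S_h), as in (9.1); x is the tuple label of a component of S_R."""
    return -1 if x == (h,) else 0

def product_derivative(indices, xs):
    """(i_1, ..., i_r | x_1, ..., x_s)_0 for s <= r, following (9.3).

    For s < r the value is 0; for s = r it is the sum over all r! orderings of
    the x's of the product of first derivatives (i_k | x_k)_0.
    """
    r, s = len(indices), len(xs)
    if s < r:
        return 0
    if s > r:
        raise NotImplementedError("s > r requires the recursion (9.2)")
    return sum(
        prod(first_derivative(i, x) for i, x in zip(indices, ordering))
        for ordering in permutations(xs)
    )
```

For instance, `product_derivative((1, 2), ((1,), (2,)))` sums two orderings, only one of which matches both labels, while `product_derivative((1, 1), ((1,), (1,)))` counts both orderings.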

Proof

Differentiating (3.7) gives

$$\begin{aligned} (h|x) = -\sum _{r=0} \left\{ I \left( x = S_{h, i_1, \ldots , i_r} \right) \left( i_1, \ldots , i_r \right) + S_{h, i_1, \ldots , i_r} \left. \left( i_1, \ldots , i_r \right| x \right) \right\} \!, \end{aligned}$$

where $(i_1, \ldots , i_r) = 1$ for $r=0$, and

$$\begin{aligned} \left. \left( h \right| x_1, \ldots , x_s \right)&= -\sum _{r=1} \Bigg \{ \sum ^s_{x_1, \ldots , x_s} I \left( x_s = S_{h, i_1, \ldots , i_r} \right) \left. \left( i_1, \ldots , i_r \right| x_1, \ldots , x_{s-1} \right) \\&\qquad \qquad + S_{h, i_1, \ldots , i_r} \left. \left( i_1, \ldots , i_r \right| x_1, \ldots , x_s \right) \Bigg \} \end{aligned}$$

for $s \ge 2$, where

$$\begin{aligned} \sum ^s_{x_1, \ldots , x_s} a_{x_1, \ldots , x_{s-1}} b_{x_s} = \sum ^s_{j=1} a_{(x)_j} b_{x_j}, \end{aligned}$$

where $(x)_j = x_1, \ldots , x_s$ with $x_j$ deleted. So, (9.1) and (9.2) follow. Note that (9.3) follows since $\delta (\theta _R) = 0$. $\square $
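The leave-one-out sum $\sum ^s_{x_1, \ldots , x_s} a_{x_1, \ldots , x_{s-1}} b_{x_s} = \sum ^s_{j=1} a_{(x)_j} b_{x_j}$ used in the proof can be written out directly. In this sketch `a` and `b` are arbitrary callables standing in for the two factors; they and the function name are placeholders, not objects from the paper.

```python
def leave_one_out_sum(a, b, xs):
    """Sum over j of a((x)_j) * b(x_j), where (x)_j is the tuple xs with its j-th entry deleted."""
    xs = tuple(xs)
    return sum(a(xs[:j] + xs[j + 1:]) * b(xs[j]) for j in range(len(xs)))

# Toy factors: a adds up the retained entries, b returns the deleted entry.
print(leave_one_out_sum(lambda rest: sum(rest), lambda x: x, (1, 2, 3)))
```

The sum has exactly $s$ terms, one for each choice of the argument that is split off into the second factor.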

Appendix B

Theorem 10.1

Suppose $Y_1, \ldots , Y_n$ are i.i.d. in $\mathbb{R }^r$ with mean $\mu $ and finite covariance $V$.

(I) Let $\{ a_{N, n} \} \subset \mathbb{R }^r$ satisfy
$$\begin{aligned} \left( \max _N a'_{N, n} V a_{N, n} \right) / \sigma _n^2 \longrightarrow 0 \end{aligned}$$
(10.1)
as $n \rightarrow \infty $, where $\sigma _n^2 = \sum ^n_{N=1} a'_{N, n} V a_{N, n}$. Then $\sum ^n_{N = 1} a'_{N, n} Y_N$ is asymptotically normal with mean $\lim _{n \rightarrow \infty } \sum _{N = 1}^n a'_{N, n} \mu $ and variance $\lim _{n \rightarrow \infty } \sigma ^2_n$.
(II) Let $\{ A_{N, n} \} \subset \mathbb{R }^{s \times r}$ satisfy
$$\begin{aligned} \left( \max _N \text{ trace } A_{N, n} V A'_{N, n} \right) / \lambda _n \longrightarrow 0 \end{aligned}$$
(10.2)
as $n \rightarrow \infty $, where $\lambda _n$ is the minimum eigenvalue of $C_n = \sum ^n_{N = 1} A_{N, n} V A'_{N, n}$. Then $\sum ^n_{N = 1} A_{N, n} Y_N$ is asymptotically normal with mean $\lim _{n \rightarrow \infty } \sum ^n_{N = 1} A_{N, n} \mu $ and covariance $\lim _{n \rightarrow \infty } C_n$.

Suppose that $V$ is positive-definite. Then (10.1) holds if

$$\begin{aligned} \max _N \left| a_{N, n} \right| ^2 / \sum ^n_{N = 1} \left| a_{N, n} \right| ^2 \longrightarrow 0, \end{aligned}$$

(10.3)

and (10.2) holds if

$$\begin{aligned} \left( \max _N \text{ trace } A_{N, n} A'_{N, n} \right) / \left( \text{ min. } \text{ eigenvalue } \text{ of } \sum A_{N, n} A'_{N, n} \right) \longrightarrow 0. \qquad \end{aligned}$$

(10.4)
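The step from (10.3) to (10.1) rests on the eigenvalue bounds $\lambda _{\min } (V) |a|^2 \le a' V a \le \lambda _{\max } (V) |a|^2$, which give that the ratio in (10.1) is at most $\lambda _{\max } (V) / \lambda _{\min } (V)$ times the ratio in (10.3). The following numerical sketch computes both ratios for a given weight array so the bound can be checked; the array layout (one weight vector $a_{N, n}$ per row) and the function names are assumptions made for illustration.

```python
import numpy as np

def ratio_10_1(a, V):
    """max_N a_N' V a_N divided by sigma_n^2 = sum_N a_N' V a_N, the quantity in (10.1)."""
    quad = np.einsum("ni,ij,nj->n", a, V, a)   # quadratic form a_N' V a_N for every row N
    return quad.max() / quad.sum()

def ratio_10_3(a):
    """max_N |a_N|^2 divided by sum_N |a_N|^2, the quantity in (10.3)."""
    sq = (a ** 2).sum(axis=1)
    return sq.max() / sq.sum()

def condition_number(V):
    """lambda_max(V) / lambda_min(V) for a symmetric positive-definite V."""
    eig = np.linalg.eigvalsh(V)
    return eig.max() / eig.min()

V = np.array([[2.0, 0.5], [0.5, 1.0]])
for n in (10, 100, 1000):
    # Example weights a_{N,n} = n^{-1/2} (1, N/n): both ratios shrink like 1/n.
    N = np.arange(1, n + 1)
    a = np.column_stack([np.ones(n), N / n]) / np.sqrt(n)
    print(n, ratio_10_1(a, V), condition_number(V) * ratio_10_3(a))
```

Both ratios are unchanged when every weight is multiplied by the same constant, so only the relative sizes of the $a_{N, n}$ matter for the conditions.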

Proof

Suppose the minimum eigenvalue of $V$ is positive and (10.3) holds. Then the proof of (I) follows that given on page 153 of Hajek and Sidak (1967) for the case $r=1$. The result in (II) follows under (10.4) by the Cramér–Wold device. That (I), (II) hold under (10.1), (10.2) follows by writing $Y_N = V^{1/2} X_N$, where $\{ X_N \}$ are i.i.d. with covariance $I_r$. $\square $

About this article

Cite this article

Withers, C.S., Nadarajah, S. Asymptotic properties of $M$-estimators in linear and nonlinear multivariate regression models. Metrika 77, 647–673 (2014). https://doi.org/10.1007/s00184-013-0458-4

Received: 08 February 2013
Published: 23 August 2013
Issue Date: July 2014
DOI: https://doi.org/10.1007/s00184-013-0458-4
