Least Absolute Deviations, pp. 37–76

# LAD in Linear Regression


## Abstract

Let Z = (X, Y) ∈ R^{k+1} be a random vector whose components obey the linear model

$$ Y = a_1 X_1 + \cdots + a_k X_k + U = \langle \underline{a}, \underline{X} \rangle + U \qquad (1) $$

where \( \underline{a} \in \mathbb{R}^k \) and the random variable U, with E(U) = m, are given. If X and U are independent, then E(U | X) = E(U) almost surely, and

$$ E(Y \mid \underline{X}) = \langle \underline{a}, \underline{X} \rangle + m \quad \text{a.s.} \qquad (2) $$
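The LAD estimator of \( \underline{a} \) minimizes \( \sum_i |y_i - \langle \underline{a}, \underline{x}_i \rangle| \) over a sample. As a minimal sketch (not the chapter's own algorithm), the minimization can be posed as a linear program by splitting each residual into its positive and negative parts; the function name `lad_fit` and the use of SciPy's `linprog` are illustrative choices:

```python
import numpy as np
from scipy.optimize import linprog

def lad_fit(X, y):
    """Minimize sum_i |y_i - <a, x_i>| over a, via the standard LP
    reformulation: each residual is split as r_plus - r_minus with
    r_plus, r_minus >= 0, and sum(r_plus + r_minus) is minimized."""
    n, k = X.shape
    # decision variables: [a (k, free), r_plus (n, >=0), r_minus (n, >=0)]
    c = np.concatenate([np.zeros(k), np.ones(2 * n)])
    A_eq = np.hstack([X, np.eye(n), -np.eye(n)])  # X a + r+ - r- = y
    bounds = [(None, None)] * k + [(0, None)] * (2 * n)
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    return res.x[:k]

# Exactly linear data: the LAD fit recovers the slope with zero residuals.
X = np.array([[1.0], [2.0], [3.0]])
y = np.array([3.0, 6.0, 9.0])
a_hat = lad_fit(X, y)
```

At the optimum exactly one of r_plus, r_minus is nonzero per observation, so the objective equals the sum of absolute residuals; this is the usual reason LAD fits are computed by linear programming rather than calculus.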

## Keywords

Asymptotic Normality, Influence Function, Unique Minimizer, Strong Consistency, Breakdown Point


## Notes

- 1. The requirement in Lemma 2.3 is that P(X ∈ S) = 0 for every proper linear subspace S ⊂ R^k. It is easy to see that this condition is necessary for uniqueness of the minimizer of g.
- 2. Theorem 2.1 is not the first result describing the strong consistency of LAD. Amemiya (1979) proves it for independent and identically distributed samples when U has infinite mean. The present proof is similar to that of Gross and Steiger (1979) from the context of time series and allows the independence assumption to be relaxed; i.i.d. samples are not required.
- 3. The assumptions of Theorem 2 do not imply those of Bassett-Koenker. Even though (x_i, y_i) stationary and ergodic implies that the design sample covariance matrix (Z′Z)/n is positive definite for n large, the non-independence of the y_i − ⟨c, x_i⟩ requires new central limit theory. The idea of the proof is similar to, but simpler than, that of Amemiya (1979), which seems to contain some mistakes. Ruppert and Carroll (1977) have taken a related approach for the location problem.
- 4. LAD is consistent for infinite-variance regressions where least squares certainly is not. Suppose U, V are i.i.d. and integrable, and write Z = U + V, X = U. Then X has a linear regression on Z with slope a = 1/2 because $$ E(X \mid Z) = E(U \mid U+V) = E(V \mid U+V) = E(U+V \mid U+V)/2 = Z/2. $$ However, if U and V are taken to be symmetric stable random variables of index α < 2, the least squares estimator ĉ_n = Σ x_i z_i / Σ z_i², based on an i.i.d. sample (z_i, x_i), converges in distribution to S/(S + T), S and T being independent positive stable random variables of index α/2 [see Kanter and Steiger (1974)], so least squares is not consistent. A result in Kanter and Steiger (1977) shows that if α > 1, the LAD estimator â_n → 1/2 in probability. Perhaps the most convenient source for material on stable laws and their domains of attraction is Feller (1971).
- 5. Jaeckel (1972) does not point out that choosing the scores as in (4.6) yields the LAD estimator, nor do Bassett and Koenker (1978) acknowledge that asymptotic normality of LAD can follow from Jaeckel (1972) for unconstrained regressions. Furthermore, although Hogg (1979) discusses R-estimation for regressions using the sign scores, he does not mention any connection with LAD regression. Hence it is reasonable to suppose that Lemma 4.1 is new. It was mentioned by M. Osborne in a seminar in Canberra in 1980.
- 6. Monte-Carlo experiments apparently demonstrate the inconsistency of least squares when X_2 and U both have infinite variance. On the other hand, Kanter and Steiger (1974) have shown that the least squares estimator is consistent if Y = aX + U, X and U being i.i.d. random variables attracted to a stable law of index α ∈ (0, 2), hence having infinite variance. This is a different linear model from that described in Note 4.
- 7. Strong consistency seems more difficult to establish for least squares than for LAD [see Lai, Robbins, and Wei (1978), Lai and Robbins (1980), and Nelson (1980)]; the last two also allow non-i.i.d. samples. In contrast, asymptotic normality seems easier to prove for least squares than for LAD. We do not know whether this is a fact or an artifact, and can offer no explanation.
- 8. The influence of data points on least squares regression estimates has been discussed by Belsley, Kuh, and Welsch (1980) and Cook and Weisberg (1980, 1982), among others.
- 9. The influence function for M-estimators of regression coefficients is described in Krasker and Welsch (1982).
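The contrast in Note 4 is easy to observe numerically. The sketch below assumes the Chambers-Mallows-Stuck recipe for simulating symmetric stable variates, and uses the elementary fact that for a single regressor with no intercept the LAD slope minimizing Σ|x_i − a z_i| = Σ|z_i| · |x_i/z_i − a| is a weighted median of the ratios x_i/z_i with weights |z_i|. The sampler, the index α = 1.5, and the sample size are illustrative choices, not taken from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

def sym_stable(alpha, size, rng):
    """Symmetric alpha-stable variates via the Chambers-Mallows-Stuck
    formula (valid for alpha != 1)."""
    theta = rng.uniform(-np.pi / 2, np.pi / 2, size)
    w = rng.exponential(1.0, size)
    return (np.sin(alpha * theta) / np.cos(theta) ** (1 / alpha)
            * (np.cos((1 - alpha) * theta) / w) ** ((1 - alpha) / alpha))

def lad_slope(z, x):
    """LAD slope for x ~ a*z without intercept: the weighted median of
    the ratios x_i/z_i with weights |z_i|."""
    r, w = x / z, np.abs(z)
    order = np.argsort(r)
    cw = np.cumsum(w[order])
    idx = np.searchsorted(cw, cw[-1] / 2)
    return r[order][idx]

alpha, n = 1.5, 50_000
u = sym_stable(alpha, n, rng)
v = sym_stable(alpha, n, rng)
z, x = u + v, u                     # Note 4's model: X = U, Z = U + V
ls = np.sum(x * z) / np.sum(z * z)  # least squares slope
lad = lad_slope(z, x)               # LAD slope
```

In repeated runs the LAD slope typically lands near the true value 1/2 (consistent with Kanter and Steiger (1977) since α > 1), while the least squares slope stays a nondegenerate random quantity across seeds, as Note 4 predicts.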

## Copyright information

© Birkhäuser Boston, Inc. 1983