The New Palgrave Dictionary of Economics

2018 Edition
Editors: Macmillan Publishers Ltd

Outliers

  • William S. Krasker
Reference work entry
DOI: https://doi.org/10.1057/978-1-349-95189-5_1884

Abstract

Nearly all empirical investigations in economics, particularly those involving linear structural models or regressions, are subject to the problem of anomalous data, commonly called outliers. Roughly speaking, there are three sources of outliers. First, the distribution of the model’s random disturbances often has longer tails than the normal distribution, which greatly increases the chance of large disturbances. Second, the data set may contain erroneous numbers, or ‘gross errors’. The databases most prone to gross errors are large cross sections, particularly those compiled from surveys; gross errors can result from misinterpreted questions, incorrectly recorded answers, keypunch errors, and the like. Third, the model itself, typically linear in (transformations of) the variables, is only an approximation to reality, and it is apt to represent the process generating the data poorly at extreme values of the explanatory variables. This third source of outliers applies even to, say, macroeconomic time series, where the likelihood of gross errors is minimal.
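
The first two sources of outliers can be made concrete with a small numerical sketch. It is an illustration under stated assumptions, not a method taken from the entry: the Laplace (double exponential) distribution stands in for a generic long-tailed disturbance distribution, the ten data points are invented, and the helper names (`normal_tail`, `laplace_tail`, `ols_slope`) are hypothetical.

```python
import math


def normal_tail(k):
    """P(|Z| > k) for a standard normal disturbance."""
    return math.erfc(k / math.sqrt(2.0))


def laplace_tail(k):
    """P(|X| > k) for a Laplace disturbance rescaled to unit variance.

    Assumption: the Laplace distribution is used only as a convenient
    example of a longer-tailed alternative to the normal.
    """
    b = 1.0 / math.sqrt(2.0)  # Laplace variance is 2*b**2, so b = 1/sqrt(2)
    return math.exp(-k / b)


def ols_slope(x, y):
    """Least-squares slope from regressing y on x with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Source 1: long tails. Both distributions have variance 1, yet a
# disturbance beyond 3 standard deviations is several times more likely
# under the Laplace distribution than under the normal.
tail_ratio = laplace_tail(3.0) / normal_tail(3.0)

# Source 2: a gross error. Invented data lying exactly on y = 2x, with the
# last value mis-keyed as 2.0 instead of 20.0 (a hypothetical decimal slip).
x = [float(i) for i in range(1, 11)]
y_clean = [2.0 * xi for xi in x]
y_dirty = list(y_clean)
y_dirty[-1] = 2.0

slope_clean = ols_slope(x, y_clean)
slope_dirty = ols_slope(x, y_dirty)

print(f"tail probability ratio (Laplace / normal) at 3 sd: {tail_ratio:.2f}")
print(f"OLS slope, clean data: {slope_clean:.4f}")
print(f"OLS slope, one gross error: {slope_dirty:.4f}")
```

In this sketch a single mis-keyed value among ten observations pulls the least-squares slope from 2 to roughly 1.02, because the erroneous point sits at the largest value of the explanatory variable, where it has the most leverage.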



Copyright information

© Macmillan Publishers Ltd. 2018
