Robust Regression with a Categorical Covariable

  • Mia Huber
  • Peter J. Rousseeuw
Part of the Lecture Notes in Statistics book series (LNS, volume 109)


A fast algorithm is presented for robust estimation of a linear model with a distributed intercept. This is a regression model in which the data set contains groups with the same slopes but different intercepts, a situation which often occurs in economics. In each group, the algorithm first looks for outliers in (x, y)-space by means of a robust projection method. Then a modified version of the resampling technique is applied to the whole data set, in order to find an approximation to least median of squares or other regression methods with a positive breakdown point. Because of the preliminary projections, the number of subsets may be drastically reduced. Simulations and examples show that the overall computation time is substantially lower than that of the straightforward algorithm. The method is illustrated with a real data set.
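To make the resampling idea in the abstract concrete, the following is a minimal sketch and not the authors' algorithm. It approximates a least median of squares fit of y = alpha[group] + x·beta by drawing random within-group pairs of points. The function name `lms_distributed_intercept`, the use of within-group differences to eliminate the intercepts, and the choice of group medians as intercepts are all assumptions made for illustration. The preliminary projection step that the abstract uses to screen outliers and reduce the number of subsets is omitted here.

```python
import numpy as np

def lms_distributed_intercept(x, y, groups, n_subsets=500, seed=0):
    """Approximate LMS fit of y = alpha[group] + x @ beta by resampling.

    Illustrative sketch only: slopes come from random within-group pairs,
    intercepts from group medians of the partial residuals (an assumption,
    not the published method).
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float).reshape(len(y), -1)
    labels, g = np.unique(groups, return_inverse=True)
    members = [np.flatnonzero(g == k) for k in range(len(labels))]
    usable = [m for m in members if len(m) >= 2]  # groups that can supply a pair
    p = x.shape[1]
    best_crit, best_beta, best_alpha = np.inf, None, None
    for _ in range(n_subsets):
        # Differencing two points of the same group removes that group's
        # intercept, so p such pairs determine the p common slopes.
        dx = np.empty((p, p))
        dy = np.empty(p)
        for r in range(p):
            m = usable[rng.integers(len(usable))]
            i, j = rng.choice(m, size=2, replace=False)
            dx[r] = x[i] - x[j]
            dy[r] = y[i] - y[j]
        try:
            beta = np.linalg.solve(dx, dy)
        except np.linalg.LinAlgError:
            continue  # degenerate subset, draw another one
        partial = y - x @ beta
        # One intercept per group: the median of its partial residuals.
        alpha = np.array([np.median(partial[m]) for m in members])
        # LMS criterion: median of squared residuals over the whole data set.
        crit = np.median((partial - alpha[g]) ** 2)
        if crit < best_crit:
            best_crit, best_beta, best_alpha = crit, beta, alpha
    return best_beta, dict(zip(labels.tolist(), best_alpha)), best_crit
```

The function returns the common slope vector, a dictionary mapping each group label to its intercept, and the attained criterion value. Keeping the candidate with the smallest median squared residual, rather than the smallest sum of squares, is what lets the fit ignore a minority of outlying points.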

Key words and phrases

Algorithms; computation time; distributed intercept; outlier detection; positive-breakdown methods

AMS subject classifications

62F35, 62J05





Copyright information

© Springer-Verlag New York, Inc. 1996

Authors and Affiliations

  • Mia Huber (1)
  • Peter J. Rousseeuw (1)
  1. University of Antwerp, Belgium
