Algorithmica, Volume 69, Issue 1, pp 148–183

On the Least Trimmed Squares Estimator

  • David M. Mount
  • Nathan S. Netanyahu
  • Christine D. Piatko
  • Ruth Silverman
  • Angela Y. Wu

Abstract

The linear least trimmed squares (LTS) estimator is a statistical technique for fitting a linear model to a set of points. Given a set of n points in ℝ^d and an integer trimming parameter h ≤ n, LTS involves computing the (d−1)-dimensional hyperplane that minimizes the sum of the smallest h squared residuals. LTS is a robust estimator with a 50% breakdown point, which means that the estimator is insensitive to corruption due to outliers, provided that the outliers constitute less than 50% of the set. LTS is closely related to the well-known LMS estimator, in which the objective is to minimize the median squared residual, and to LTA, in which the objective is to minimize the sum of the smallest 50% of absolute residuals. LTS has the advantage of being statistically more efficient than LMS. Unfortunately, the computational complexity of LTS is less well understood than that of LMS. In this paper we present new algorithms, both exact and approximate, for computing the LTS estimator. We also present hardness results for exact and approximate LTS. A number of our results apply to the LTA estimator as well.
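
To make the LTS objective concrete, the following is a minimal sketch for the planar case (d = 2), assuming the model y = slope·x + intercept with y as the response. It is not one of the algorithms of the paper: it is a brute-force illustration that relies on the fact that an optimal LTS fit is the ordinary least-squares fit of some h-element subset, so it enumerates all such subsets and takes time exponential in n. The function names (`lts_cost`, `least_squares_line`, `brute_force_lts`) are hypothetical and chosen only for this example.

```python
from itertools import combinations


def lts_cost(points, slope, intercept, h):
    """LTS objective: sum of the h smallest squared residuals
    of the points with respect to the line y = slope*x + intercept."""
    squared = sorted((y - (slope * x + intercept)) ** 2 for x, y in points)
    return sum(squared[:h])


def least_squares_line(points):
    """Ordinary least-squares line through the given (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    # If all x-values coincide, every line through (mean_x, mean_y) has the
    # same cost on this subset, so a horizontal line is as good as any.
    slope = sxy / sxx if sxx > 0 else 0.0
    return slope, mean_y - slope * mean_x


def brute_force_lts(points, h):
    """Exact LTS line by enumerating all h-subsets (illustration only;
    feasible only for very small inputs)."""
    best = None
    for subset in combinations(points, h):
        slope, intercept = least_squares_line(subset)
        cost = lts_cost(points, slope, intercept, h)
        if best is None or cost < best[2]:
            best = (slope, intercept, cost)
    return best


if __name__ == "__main__":
    # Six points on y = 2x + 1, plus two gross outliers.
    data = [(float(x), 2.0 * x + 1.0) for x in range(6)]
    data += [(1.0, 20.0), (4.0, -15.0)]
    print("LTS fit (h = 6):", brute_force_lts(data, 6))
    print("Least-squares fit on all points:", least_squares_line(data))
```

On this toy input, with h = 6 of n = 8 points, the LTS fit recovers the line y = 2x + 1 with zero cost, whereas the least-squares fit on all eight points is pulled away by the two outliers.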

Keywords

Robust estimation; Linear estimation; Least trimmed squares estimator; Approximation algorithms; Lower bounds

Copyright information

© Springer Science+Business Media New York 2012

Authors and Affiliations

  • David M. Mount (1)
  • Nathan S. Netanyahu (2, 3)
  • Christine D. Piatko (4)
  • Ruth Silverman (3)
  • Angela Y. Wu (5)

  1. Department of Computer Science, University of Maryland, College Park, USA
  2. Department of Computer Science, Bar-Ilan University, Ramat-Gan, Israel
  3. Center for Automation Research, University of Maryland, College Park, USA
  4. The Johns Hopkins University Applied Physics Laboratory, Laurel, USA
  5. Department of Computer Science, American University, Washington, USA
