High Breakdown Point Estimators in Logistic Regression

  • Andreas Christmann
Part of the Lecture Notes in Statistics book series (LNS, volume 109)

Abstract

Estimators with high finite sample breakdown points are of special interest in robust statistics. However, in contrast to estimation in linear regression models, the breakdown point approach has not yet received much attention in logistic regression models. Although various robust estimators have been proposed for logistic regression models, their breakdown points are often not yet known. Here it is shown for logistic regression models with binary data that there is no estimator with a high finite sample breakdown point, provided the estimator has to fulfill a weak condition. However, in logistic regression models with large strata, modifications of Rousseeuw’s least median of squares estimator and least trimmed squares estimator have finite sample breakdown points of approximately 1/2. Both estimators are strongly consistent under a large supermodel of the logistic regression model. Existing programs can be used to compute such estimates.
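To illustrate the idea of a least-median-type fit for data with large strata, here is a minimal sketch for a single covariate. It is not the chapter's estimator: the empirical-logit transformation with a 0.5 continuity correction, the weights `n * p * (1 - p)`, the plain median as criterion, and the exhaustive search over pairs of strata are all assumptions chosen for illustration, and the function names `empirical_logits` and `lmws_line` are hypothetical.

```python
import math
from itertools import combinations
from statistics import median


def empirical_logits(x, y, n):
    """Return (x_i, z_i, w_i) per stratum: covariate, empirical logit, weight.

    Assumption: 0.5 continuity correction and weights n * p * (1 - p).
    """
    points = []
    for xi, yi, ni in zip(x, y, n):
        p = (yi + 0.5) / (ni + 1.0)      # corrected success proportion
        z = math.log(p / (1.0 - p))      # empirical logit
        w = ni * p * (1.0 - p)           # approximate inverse variance of z
        points.append((xi, z, w))
    return points


def lmws_line(x, y, n):
    """Search all lines through two strata for the smallest median
    weighted squared residual; return (intercept, slope)."""
    points = empirical_logits(x, y, n)
    best = None
    for (x1, z1, _), (x2, z2, _) in combinations(points, 2):
        if x1 == x2:
            continue                     # a vertical line is not a fit
        slope = (z2 - z1) / (x2 - x1)
        intercept = z1 - slope * x1
        crit = median(w * (z - intercept - slope * xi) ** 2
                      for xi, z, w in points)
        if best is None or crit < best[0]:
            best = (crit, intercept, slope)
    if best is None:
        raise ValueError("need at least two strata with distinct x values")
    return best[1], best[2]


# Hypothetical data: 9 strata of size 1000, true logit 0.5 * x,
# with the last two strata replaced by gross outliers.
x = [-4, -3, -2, -1, 0, 1, 2, 3, 4]
n = [1000] * 9
y = [119, 182, 269, 378, 500, 622, 731, 5, 5]
intercept, slope = lmws_line(x, y, n)
print(intercept, slope)
```

In this toy data set two of the nine strata are corrupted, yet the selected line is determined by a pair of uncorrupted strata, because any line through an outlier leaves most of the remaining residuals large. The pairwise search costs on the order of the squared number of strata per candidate evaluation, so it is only practical for small examples.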

Key words and phrases

High breakdown point; least median of squares; least trimmed squares; least median of weighted squares; least trimmed weighted squares; logistic regression; outliers; overdispersion; robust regression

AMS 1991 subject classifications

62F35, 62F10

Copyright information

© Springer-Verlag New York, Inc. 1996

Authors and Affiliations

  • Andreas Christmann, University of Hamburg, Germany
