
Computational Statistics, Volume 33, Issue 2, pp 563–593

Logistic regression diagnostics in ridge regression

  • M. Revan Özkale
  • Stanley Lemeshow
  • Rodney Sturdivant
Original Paper

Abstract

The adverse effects of multicollinearity and unusual observations are seen in logistic regression, and attention has been given in the literature to each of these problems separately. However, multicollinearity and unusual observations can arise simultaneously in logistic regression. The objective of this paper is to propose statistics for detecting unusual observations in an ill-conditioned data set under the ridge logistic estimator. A numerical example and two Monte Carlo simulation studies are used to illustrate the methodology. The present investigation shows that ridge logistic estimation copes with unusual observations by downweighting their influence.

Keywords

Logistic regression; Regression diagnostics; Ridge logistic estimator; Multicollinearity

Notes

Acknowledgements

This research was supported by the Scientific and Technological Research Council of Turkey (TÜBİTAK), which funded M. Revan Özkale as a visiting scholar for 5 months at The Ohio State University, College of Public Health, Division of Biostatistics.

Copyright information

© Springer-Verlag GmbH Germany 2017

Authors and Affiliations

  • M. Revan Özkale (1)
  • Stanley Lemeshow (2)
  • Rodney Sturdivant (3)
  1. Department of Statistics, Faculty of Science and Letters, Çukurova University, Adana, Turkey
  2. Division of Biostatistics, College of Public Health, The Ohio State University, Columbus, USA
  3. Department of Mathematics and Physics, College of Liberal Arts and Sciences, Azusa Pacific University, Azusa, USA
