Computational Statistics, Volume 33, Issue 2, pp 563–593

Logistic regression diagnostics in ridge regression

  • M. Revan Özkale
  • Stanley Lemeshow
  • Rodney Sturdivant
Original Paper

Abstract

Multicollinearity and unusual observations both have adverse effects in logistic regression, and the literature has addressed each of these problems separately. However, the two can arise simultaneously. The objective of this paper is to propose statistics for detecting unusual observations in an ill-conditioned data set under the ridge logistic estimator. A numerical example and two Monte Carlo simulation studies illustrate the methodology. The investigation shows that ridge logistic estimation copes with unusual observations by downweighting their influence.
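To make the idea of diagnostics under a ridge-type logistic fit concrete, the following is a minimal sketch rather than the authors' method. It assumes the penalized-likelihood form of ridge logistic regression, fitted by Newton/IRLS steps, and the usual hat-matrix analogue with the ridge constant added to the information matrix. The function names `ridge_logistic_irls` and `ridge_diagnostics`, the ridge constant `k`, the absence of an intercept, and the simulated data are all illustrative assumptions; the statistics proposed in the paper may be defined differently.

```python
import numpy as np

def ridge_logistic_irls(X, y, k, n_iter=50, tol=1e-8):
    """L2-penalised logistic regression via Newton/IRLS steps (illustrative)."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-(X @ beta)))      # fitted probabilities
        w = mu * (1.0 - mu)                         # IRLS weights
        grad = X.T @ (y - mu) - k * beta            # penalised score
        hess = X.T @ (X * w[:, None]) + k * np.eye(p)
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def ridge_diagnostics(X, y, beta, k):
    """Leverages from W^(1/2) X (X'WX + kI)^(-1) X' W^(1/2) and Pearson residuals."""
    p = X.shape[1]
    mu = 1.0 / (1.0 + np.exp(-(X @ beta)))
    w = mu * (1.0 - mu)
    Xw = X * np.sqrt(w)[:, None]
    A = Xw.T @ Xw + k * np.eye(p)
    # A is symmetric, so solve(A, Xw.T).T equals Xw @ inv(A)
    leverage = np.einsum("ij,ij->i", np.linalg.solve(A, Xw.T).T, Xw)
    pearson = (y - mu) / np.sqrt(w)
    return leverage, pearson

# Simulated ill-conditioned design: the first two columns are nearly collinear.
rng = np.random.default_rng(0)
z = rng.normal(size=200)
X = np.column_stack([z, z + 0.05 * rng.normal(size=200), rng.normal(size=200)])
prob = 1.0 / (1.0 + np.exp(-(X @ np.array([0.5, 0.5, -1.0]))))
y = (rng.uniform(size=200) < prob).astype(float)

beta_ridge = ridge_logistic_irls(X, y, k=1.0)
leverage, pearson = ridge_diagnostics(X, y, beta_ridge, k=1.0)
print("largest leverages:", np.sort(leverage)[-3:])
print("largest |Pearson residuals|:", np.sort(np.abs(pearson))[-3:])
```

In this sketch, at a fixed coefficient vector, adding `k` to the information matrix can only shrink each leverage relative to the unpenalized case, which is one simple way to see the downweighting effect the abstract describes.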

Keywords

Logistic regression · Regression diagnostics · Ridge logistic estimator · Multicollinearity

Notes

Acknowledgements

This research was supported by the Scientific and Technological Research Council of Turkey (TÜBİTAK), which funded M. Revan Özkale as a visiting scholar for 5 months at The Ohio State University, College of Public Health, Division of Biostatistics.


Copyright information

© Springer-Verlag GmbH Germany 2017

Authors and Affiliations

  • M. Revan Özkale (1)
  • Stanley Lemeshow (2)
  • Rodney Sturdivant (3)
  1. Department of Statistics, Faculty of Science and Letters, Çukurova University, Adana, Turkey
  2. Division of Biostatistics, College of Public Health, The Ohio State University, Columbus, USA
  3. Department of Mathematics and Physics, College of Liberal Arts and Sciences, Azusa Pacific University, Azusa, USA
