What Do We Choose When We Err? Model Selection and Testing for Misspecified Logistic Regression Revisited

  • Jan Mielniczuk
  • Paweł Teisseyre
Part of the Studies in Computational Intelligence book series (SCI, volume 605)


The problem of fitting logistic regression to a binary model while allowing for misspecification of the response function is reconsidered. We introduce a two-stage procedure that first orders the predictors with respect to the deviances of the models with the predictor in question omitted, and then chooses the minimizer of the Generalized Information Criterion in the resulting nested family of models. In contrast to an exhaustive search, this allows a large number of potential predictors to be considered. We prove that the procedure consistently chooses the model \(t^{*}\) that is closest in the averaged Kullback-Leibler sense to the true binary model \(t\). We then consider the interplay between \(t\) and \(t^{*}\) and prove that, for a monotone response function, when the response genuinely depends on the predictors, \(t^{*}\) is necessarily nonempty. This implies consistency of the deviance test of significance under misspecification. For a class of predictor distributions that includes the normal family, Ruud’s result asserts that \(t^{*}=t\). Numerical experiments reveal that, for normally distributed predictors, the probability of correct selection and the power of the deviance test depend monotonically on Ruud’s proportionality constant \(\eta \).
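The two-stage procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `fit_logistic_deviance` and `two_stage_select` are hypothetical, the fit is a plain Newton-Raphson iteration with a tiny ridge term added for numerical stability, and the Generalized Information Criterion is assumed to take the form deviance plus `penalty` times the number of predictors, with the BIC-type choice `penalty = log(n)` as the default.

```python
import numpy as np


def fit_logistic_deviance(X, y, n_iter=50, ridge=1e-8):
    """Fit logistic regression with an intercept; return the deviance (-2 log-likelihood)."""
    n = X.shape[0]
    Z = np.column_stack([np.ones(n), X])  # the intercept is always in the model
    beta = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        eta = np.clip(Z @ beta, -30.0, 30.0)
        prob = 1.0 / (1.0 + np.exp(-eta))
        grad = Z.T @ (y - prob)
        weights = prob * (1.0 - prob)
        # Tiny ridge term: an assumption made here only for numerical stability.
        hess = (Z * weights[:, None]).T @ Z + ridge * np.eye(Z.shape[1])
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < 1e-8:
            break
    eta = np.clip(Z @ beta, -30.0, 30.0)
    loglik = np.sum(y * eta - np.logaddexp(0.0, eta))
    return -2.0 * loglik


def two_stage_select(X, y, penalty=None):
    """Return (selected predictor indices, full ordering of predictors)."""
    n, p = X.shape
    if penalty is None:
        penalty = np.log(n)  # assumed BIC-type penalty
    # Stage 1: deviance of the full model with predictor j omitted.
    # A larger deviance means omitting j hurts more, so j is ranked earlier.
    dev_without = np.array(
        [fit_logistic_deviance(np.delete(X, j, axis=1), y) for j in range(p)]
    )
    order = np.argsort(-dev_without)
    # Stage 2: minimize GIC over the nested family {}, {o1}, {o1, o2}, ...
    gic = [
        fit_logistic_deviance(X[:, order[:k]], y) + penalty * k
        for k in range(p + 1)
    ]
    k_best = int(np.argmin(gic))
    return sorted(order[:k_best].tolist()), order.tolist()


if __name__ == "__main__":
    # Misspecified setting: the true response function is a Cauchy CDF, not logistic.
    rng = np.random.default_rng(0)
    X = rng.standard_normal((1500, 6))
    s = 1.5 * X[:, 0] - 1.5 * X[:, 1]  # only predictors 0 and 1 matter
    y = (rng.random(1500) < 0.5 + np.arctan(s) / np.pi).astype(float)
    print(two_stage_select(X, y))
```

The second stage fits only p + 1 nested models rather than all 2^p subsets, which is what makes a large number of candidate predictors feasible. The simulated example uses normally distributed predictors and a monotone but non-logistic response function, the setting in which Ruud’s result applies.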


Keywords: Incorrect model specification, Variable selection, Logistic regression


References

  1. Bache K, Lichman M (2013) UCI machine learning repository. University of California, Irvine
  2. Bishop CM (2006) Pattern recognition and machine learning. Springer, New York
  3. Bogdan M, Doerge R, Ghosh J (2004) Modifying the Schwarz Bayesian Information Criterion to locate multiple interacting quantitative trait loci. Genetics 167:989–999
  4. Bozdogan H (1987) Model selection and Akaike’s information criterion (AIC): the general theory and its analytical extensions. Psychometrika 52:345–370
  5. Burnham K, Anderson D (2002) Model selection and multimodel inference. A practical information-theoretic approach. Springer, New York
  6. Carroll R, Pederson S (1993) On robustness in the logistic regression model. J R Stat Soc B 55:693–706
  7. Casella G, Giron J, Martinez M, Moreno E (2009) Consistency of Bayes procedures for variable selection. Ann Stat 37:1207–1228
  8. Chen J, Chen Z (2008) Extended Bayesian Information Criteria for model selection with large model spaces. Biometrika 95:759–771
  9. Chen J, Chen Z (2012) Extended BIC for small-n-large-p sparse glm. Statistica Sinica 22:555–574
  10. Claeskens G, Hjort N (2008) Model selection and model averaging. Cambridge University Press, Cambridge
  11. Czado C, Santner T (1992) The effect of link misspecification on binary regression inference. J Stat Plann Infer 33:213–231
  12. Fahrmeir L (1987) Asymptotic testing theory for generalized linear models. Statistics 1:65–76
  13. Fahrmeir L (1990) Maximum likelihood estimation in misspecified generalized linear models. Statistics 4:487–502
  14. Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Assoc 96:1348–1360
  15. Foster D, George E (1994) The risk inflation criterion for multiple regression. Ann Stat 22:1947–1975
  16. Hjort N, Pollard D (1993) Asymptotics for minimisers of convex processes. Unpublished manuscript
  17. Konishi S, Kitagawa G (2008) Information criteria and statistical modeling. Springer, New York
  18. Lehmann E (1959) Testing statistical hypotheses. Wiley, New York
  19. Li K, Duan N (1991) Slicing regression: a link-free regression method. Ann Stat 19(2):505–530
  20. Qian G, Field C (2002) Law of iterated logarithm and consistent model selection criterion in logistic regression. Stat Probab Lett 56:101–112
  21. Ruud P (1983) Sufficient conditions for the consistency of maximum likelihood estimation despite misspecification of distribution in multinomial discrete choice models. Econometrica 51(1):225–228
  22. Sin C, White H (1996) Information criteria for selecting possibly misspecified parametric models. J Econometrics 71:207–225
  23. Zak-Szatkowska M, Bogdan M (2011) Modified versions of Bayesian Information Criterion for sparse generalized linear models. Comput Stat Data Anal 5:2908–2924
  24. Zou H, Hastie T (2005) Regularization and variable selection via the elastic net. J R Stat Soc B 67(2):301–320

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland
  2. Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland
