
Multinomial logit models with implicit variable selection

Regular Article · Advances in Data Analysis and Classification

Abstract

The multinomial logit model is the most widely used model for unordered multi-category responses. However, applications are typically restricted to a small number of predictors, because in the high-dimensional case maximum likelihood estimates frequently do not exist. In this paper we develop a boosting technique called multinomBoost that performs variable selection and fits the multinomial logit model even when predictors are high-dimensional. Since in multi-category models the effect of one predictor variable is represented by several parameters, one has to distinguish between variable selection and parameter selection. A special feature of the approach is that, in contrast to existing approaches, it selects variables rather than parameters. The method can also distinguish between mandatory and optional predictors, and it adapts to metric, binary, nominal and ordinal predictors. Regularization within the algorithm makes it possible to include nominal and ordinal variables with many categories; in the case of ordinal predictors the order information is exploited. The performance of the boosting technique with respect to mean squared error, prediction error and the identification of relevant variables is investigated in a simulation study. The method is applied to the National Indonesia Contraceptive Prevalence Survey and to glass identification data, and the results are compared with the Lasso approach, which selects parameters.
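To make the distinction between variable selection and parameter selection concrete, here is a minimal numpy sketch of variable-wise gradient boosting for the multinomial logit. It illustrates the selection principle only and is not the authors' multinomBoost algorithm (which is likelihood-based boosting with penalized weak learners and additionally handles mandatory, nominal and ordinal predictors). The function names softmax and boost_multinomial are ours, predictors are assumed metric and standardized, and the sketch uses the symmetric parameterization with one coefficient column per category.

```python
import numpy as np


def softmax(eta):
    """Row-wise softmax over the category columns of eta."""
    eta = eta - eta.max(axis=1, keepdims=True)
    p = np.exp(eta)
    return p / p.sum(axis=1, keepdims=True)


def boost_multinomial(X, y, n_steps=200, nu=0.1):
    """Variable-wise gradient boosting for a multinomial logit model.

    X : (n, p) standardized predictor matrix
    y : (n,) class labels 0, ..., K-1

    Each step scores every predictor by the joint least-squares fit of
    its column to the gradient columns of *all* categories, so a variable
    enters with its whole parameter group or not at all.
    """
    n, p = X.shape
    K = int(y.max()) + 1
    Y = np.eye(K)[y]                     # one-hot response matrix, n x K
    beta = np.zeros((p, K))              # one coefficient column per category
    intercept = np.log(Y.mean(axis=0))   # start at the marginal class frequencies
    xs = (X ** 2).sum(axis=0)            # squared norms of the predictor columns
    selected = set()
    for _ in range(n_steps):
        P = softmax(intercept + X @ beta)
        G = Y - P                        # negative gradient of the multinomial deviance
        B = (X.T @ G) / xs[:, None]      # tentative least-squares updates, p x K
        scores = (B ** 2).sum(axis=1) * xs
        j = int(np.argmax(scores))       # best variable, judged across all categories
        beta[j] += nu * B[j]             # weak update of the whole parameter group
        selected.add(j)
    return intercept, beta, sorted(selected)
```

For contrast, a parameter-selecting method such as the Lasso zeroes individual entries of beta, so a predictor may remain active for some response categories while being dropped for others; the group-wise score above is what keeps the decision at the level of whole variables. Mandatory predictors could be accommodated in this sketch by refitting them in every step rather than entering them into the competition.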



Author information

Correspondence to Faisal Maqbool Zahid.


Cite this article

Zahid, F.M., Tutz, G. Multinomial logit models with implicit variable selection. Adv Data Anal Classif 7, 393–416 (2013). https://doi.org/10.1007/s11634-013-0136-4

