Journal of Classification, Volume 27, Issue 1, pp 89–110

Parsimonious Classification Via Generalized Linear Mixed Models

Abstract

We devise a classification algorithm based on generalized linear mixed model (GLMM) technology. The algorithm incorporates spline smoothing, additive model-type structures and model selection. For reasons of speed we employ the Laplace approximation, rather than Monte Carlo methods. Tests on real and simulated data show the algorithm to have good classification performance. Moreover, the resulting classifiers are generally interpretable and parsimonious.
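To illustrate why the Laplace approximation is attractive for speed, the following minimal sketch approximates the marginal likelihood of a single cluster in a random-intercept logistic model and compares it with brute-force numerical integration. It is not the algorithm of the article: the model, the function names (`log_joint`, `laplace_marginal`, `quadrature_marginal`) and the parameter values are assumptions chosen purely for illustration.

```python
import math

def log_joint(u, y, beta, sigma):
    """log p(y | u) + log p(u) for one cluster with random intercept u ~ N(0, sigma^2)."""
    eta = beta + u
    loglik = sum(yi * eta - math.log1p(math.exp(eta)) for yi in y)
    logprior = -0.5 * math.log(2 * math.pi * sigma**2) - u**2 / (2 * sigma**2)
    return loglik + logprior

def laplace_marginal(y, beta, sigma, iters=50):
    """Laplace approximation to the integral of exp(log_joint) over u."""
    n, s = len(y), sum(y)
    u = 0.0
    for _ in range(iters):
        # Newton steps towards the mode of the (strictly concave) log joint density.
        p = 1.0 / (1.0 + math.exp(-(beta + u)))
        grad = s - n * p - u / sigma**2
        hess = -n * p * (1.0 - p) - 1.0 / sigma**2
        u -= grad / hess
    p = 1.0 / (1.0 + math.exp(-(beta + u)))
    hess = -n * p * (1.0 - p) - 1.0 / sigma**2
    # Gaussian integral of the second-order expansion around the mode.
    return math.exp(log_joint(u, y, beta, sigma)) * math.sqrt(2 * math.pi / -hess)

def quadrature_marginal(y, beta, sigma, lo=-10.0, hi=10.0, m=4001):
    """Reference value by the trapezoidal rule on a fine grid (slow but accurate)."""
    step = (hi - lo) / (m - 1)
    vals = [math.exp(log_joint(lo + k * step, y, beta, sigma)) for k in range(m)]
    return step * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

# Hypothetical data: 12 successes and 8 failures in one cluster.
y = [1] * 12 + [0] * 8
approx = laplace_marginal(y, beta=0.0, sigma=1.0)
exact = quadrature_marginal(y, beta=0.0, sigma=1.0)
print(approx, exact, abs(approx - exact) / exact)
```

The Laplace version needs only a handful of Newton iterations per cluster, whereas the grid-based reference (like Monte Carlo integration) evaluates the integrand thousands of times; the price is an approximation error that shrinks as the cluster size grows.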

Keywords

Akaike Information Criterion; Feature selection; Generalized additive models; Penalized splines; Supervised learning; Model selection; Rao statistics; Variance components

Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  1. Faculty of Economics, University Bielefeld, Bielefeld, Germany
  2. School of Mathematics and Statistics, University of New South Wales, Sydney, Australia
  3. School of Mathematics and Applied Statistics, University of Wollongong, Wollongong, Australia
