Variable Selection in Joint Mean and Dispersion Models via Double Penalized Likelihood

Sankhya B

An Erratum to this article was published on 20 November 2014

Abstract

In this paper, we propose jointly modelling the conditional mean and the variance components associated with the response in multilevel data. We specify a generalized linear mixed model (GLMM) for the mean and a generalized linear model (GLM) for the variance components. For variable selection we use the smoothly clipped absolute deviation (SCAD) penalty, a penalized likelihood procedure that shrinks the coefficients of redundant variables to 0 while simultaneously estimating the coefficients of the remaining important covariates. To assess the performance of the proposed procedures, we carry out a real data analysis and extensive simulation studies, and compare the results with those of a similar procedure that omits variable selection. We conclude that our method outperforms a simple joint mean-variance modelling approach, both in identifying the important components of the joint models and in producing more efficient estimates.
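
To make concrete the claim that SCAD sets small coefficients to 0 while leaving large ones essentially unshrunk, the following is a minimal sketch of the penalty in its standard form from Fan and Li (2001, J. Am. Stat. Assoc., 96: 1348–1360), together with the textbook univariate thresholding rule that it induces. It is not the double penalized likelihood algorithm of this article. The function names `scad_penalty` and `scad_threshold`, the default `a = 3.7`, and the orthonormal-design least-squares setting behind the thresholding rule are all assumptions made for illustration.

```python
def scad_penalty(theta: float, lam: float, a: float = 3.7) -> float:
    """Standard SCAD penalty p_lam(|theta|) (assumed Fan-Li form, a > 2)."""
    if lam < 0 or a <= 2:
        raise ValueError("require lam >= 0 and a > 2")
    t = abs(theta)
    if t <= lam:
        # L1-like region: linear in |theta|, as for the LASSO.
        return lam * t
    if t <= a * lam:
        # Quadratic transition region that flattens the penalty.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Constant region: large coefficients incur no extra penalty.
    return (a + 1) * lam * lam / 2


def scad_threshold(z: float, lam: float, a: float = 3.7) -> float:
    """Univariate SCAD estimate for a least-squares estimate z.

    Illustrative assumption: orthonormal design, so each coefficient
    can be thresholded separately.
    """
    if lam < 0 or a <= 2:
        raise ValueError("require lam >= 0 and a > 2")
    sign = 1.0 if z >= 0 else -1.0
    t = abs(z)
    if t <= 2 * lam:
        # Soft thresholding: estimates with |z| <= lam become exactly 0.
        return sign * max(t - lam, 0.0)
    if t <= a * lam:
        # Shrinkage is reduced gradually as |z| grows.
        return ((a - 1) * z - sign * a * lam) / (a - 2)
    # Large estimates are left unchanged.
    return z


if __name__ == "__main__":
    for z in (0.5, 1.5, 3.0, 5.0):
        print(z, scad_threshold(z, lam=1.0))
```

In this sketch an input below `lam` in absolute value is mapped to exactly 0, which is the selection step, while an input above `a * lam` is returned unchanged, which is why SCAD avoids the bias that the LASSO introduces for large coefficients.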

Author information

Correspondence to Christiana Charalambous.

Electronic Supplementary Material

Electronic supplementary material is available for this article (PDF, 249 KB).

Cite this article

Charalambous, C., Pan, J. & Tranmer, M. Variable Selection in Joint Mean and Dispersion Models via Double Penalized Likelihood. Sankhya B 76, 276–304 (2014). https://doi.org/10.1007/s13571-014-0079-6

  • DOI: https://doi.org/10.1007/s13571-014-0079-6
