Model selection in linear mixed-effect models

  • Statistical Reviews
AStA Advances in Statistical Analysis

Abstract

Linear mixed-effects models are a class of models widely used for analyzing different types of data, including longitudinal, clustered and panel data. They are employed in many fields that require statistical methodology, such as biology, chemistry, medicine and finance. Model selection is one of the most important steps in a statistical analysis. Because a large number of linear mixed model selection procedures are available in the literature, a pressing issue is how to identify the best approach to adopt in a specific case. We review most of these approaches, focusing on the part of the model subject to selection (fixed and/or random), the dimensionality of the models and the structure of the variance and covariance matrices, and, wherever possible, on whether an implementation of each methodology is available.

Author information

Corresponding author

Correspondence to Antonella Plaia.

About this article

Cite this article

Buscemi, S., Plaia, A. Model selection in linear mixed-effect models. AStA Adv Stat Anal 104, 529–575 (2020). https://doi.org/10.1007/s10182-019-00359-z

  • DOI: https://doi.org/10.1007/s10182-019-00359-z
