Simple measures of uncertainty for model selection

Liu, Xiaohui; Li, Yuanyuan; Jiang, Jiming

doi:10.1007/s11749-020-00737-9

Original Paper
Published: 01 November 2020

TEST, Volume 30, pages 673–692 (2021)

Abstract

We develop two simple measures of uncertainty for a model selection procedure. The first measure is similar in spirit to a confidence set in parameter estimation; the second focuses on the error in model selection. The proposed methods are simpler, both conceptually and computationally, than the existing measures of uncertainty in model selection. We recognize major differences between model selection and traditional estimation or prediction problems, and propose reasonable frameworks under which these measures are developed and their theoretical properties are established. Empirical studies demonstrate the performance of the proposed measures, their superiority over the existing methods, and their relevance to real-life applications.

Acknowledgements

Xiaohui Liu’s research is supported by the NNSF of China (Grant Nos. 11601197 and 11461029), a China Postdoctoral Science Foundation funded project (2016M600511, 2017T100475), and the NSF of Jiangxi Province (Nos. 2017ACB21030, 2018ACB21002). The research of Jiming Jiang is partially supported by the NSF Grants DMS-1510219 and DMS-1713120. The authors are grateful for comments from an Associate Editor and two referees that have led to substantial improvement of the manuscript.

Author information

Authors and Affiliations

Jiangxi University of Finance and Economics, Nanchang, China
Xiaohui Liu
University of California, Davis, Davis, USA
Yuanyuan Li & Jiming Jiang

Corresponding author

Correspondence to Jiming Jiang.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 478 KB)

About this article

Cite this article

Liu, X., Li, Y. & Jiang, J. Simple measures of uncertainty for model selection. TEST 30, 673–692 (2021). https://doi.org/10.1007/s11749-020-00737-9

Received: 11 April 2020
Accepted: 23 October 2020
Published: 01 November 2020
Issue Date: September 2021
DOI: https://doi.org/10.1007/s11749-020-00737-9

Keywords

Mathematics Subject Classification

62A99
