Investigation about a screening step in model selection

Sauerbrei, Willi; Holländer, Norbert; Buchholz, Anika

doi:10.1007/s11222-007-9048-5

Published: 09 January 2008

Statistics and Computing, Volume 18, pages 195–208 (2008)

Abstract

In many studies a large number of variables is measured, and identifying the relevant variables that influence an outcome is an important task. Several procedures are available for variable selection. However, focusing on a single model neglects the fact that other equally appropriate models usually exist. Bayesian and frequentist model averaging approaches have been proposed to improve the development of a predictor. With a larger number of variables (say, more than ten) the resulting class of models can be very large. For Bayesian model averaging, Occam’s window is a popular approach to reduce the model space. As this approach may not eliminate any variables, a variable screening step was proposed for a frequentist model averaging procedure: based on the models selected in bootstrap samples, variables are eliminated before a model averaging predictor is derived. As a simple alternative screening procedure, backward elimination can be used.

Using two examples and a simulation study, we investigate some properties of the screening step. In the simulation study we consider situations with 15 and 25 variables, respectively, of which seven have an influence on the outcome. The screening step eliminates most of the variables without influence, but also some variables with a weak effect. Variable screening leads to more applicable models without eliminating models that are more strongly supported by the data. Furthermore, we give recommendations for important parameters of the screening step.
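
As a rough illustration of the kind of bootstrap-based screening step the abstract describes, the following Python sketch may help. It is not the authors' implementation. The abstract does not state the selection rule, the number of bootstrap samples, or the cut-off, so the BIC-based backward elimination, the 40 bootstrap samples, the 0.3 inclusion-frequency threshold, the toy data, and all function names below are assumptions made purely for illustration.

```python
import math
import random

def ols_rss(X, y, cols):
    """Residual sum of squares for least squares of y on an intercept plus columns `cols`."""
    Z = [[1.0] + [row[j] for j in cols] for row in X]
    k = len(cols) + 1
    # Augmented normal equations [Z'Z | Z'y]
    A = [[sum(z[a] * z[b] for z in Z) for b in range(k)]
         + [sum(z[a] * yi for z, yi in zip(Z, y))] for a in range(k)]
    # Gaussian elimination with partial pivoting, then back substitution
    for i in range(k):
        piv = max(range(i, k), key=lambda r: abs(A[r][i]))
        A[i], A[piv] = A[piv], A[i]
        for r in range(i + 1, k):
            f = A[r][i] / A[i][i]
            for c in range(i, k + 1):
                A[r][c] -= f * A[i][c]
    b = [0.0] * k
    for i in reversed(range(k)):
        b[i] = (A[i][k] - sum(A[i][c] * b[c] for c in range(i + 1, k))) / A[i][i]
    return sum((yi - sum(bj * zj for bj, zj in zip(b, z))) ** 2 for z, yi in zip(Z, y))

def bic(X, y, cols):
    """BIC of the Gaussian linear model, up to an additive constant."""
    n = len(y)
    return n * math.log(ols_rss(X, y, cols) / n) + (len(cols) + 1) * math.log(n)

def backward_elimination(X, y, cols):
    """Drop one variable at a time for as long as doing so lowers the BIC (assumed rule)."""
    cols = list(cols)
    current = bic(X, y, cols)
    while cols:
        score, drop = min((bic(X, y, [c for c in cols if c != d]), d) for d in cols)
        if score >= current:
            break
        cols.remove(drop)
        current = score
    return cols

def inclusion_frequencies(X, y, n_boot=40, seed=0):
    """Fraction of bootstrap samples in which each variable is selected."""
    rng = random.Random(seed)
    n, p = len(y), len(X[0])
    counts = [0] * p
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample rows with replacement
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        for j in backward_elimination(Xb, yb, range(p)):
            counts[j] += 1
    return [c / n_boot for c in counts]

def screen(freqs, threshold=0.3):
    """Keep variables whose inclusion frequency reaches the (assumed) threshold."""
    return [j for j, f in enumerate(freqs) if f >= threshold]

# Toy data (hypothetical): five candidate variables, only the first two influence y.
rng = random.Random(1)
n, p = 80, 5
X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [2.0 * row[0] + 1.5 * row[1] + rng.gauss(0, 1) for row in X]

freqs = inclusion_frequencies(X, y)
kept = screen(freqs)
print("inclusion frequencies:", freqs)
print("variables kept after screening:", kept)
```

In this sketch only the variables in `kept` would be passed on to a subsequent model averaging step. Lowering `threshold` retains more weak-effect variables at the cost of a larger model space, which is the trade-off the abstract refers to.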

Author information

Authors and Affiliations

Stefan-Meier Str. 26, 79104, Freiburg, Germany
Willi Sauerbrei, Norbert Holländer & Anika Buchholz

Corresponding author

Correspondence to Willi Sauerbrei.

About this article

Cite this article

Sauerbrei, W., Holländer, N. & Buchholz, A. Investigation about a screening step in model selection. Stat Comput 18, 195–208 (2008). https://doi.org/10.1007/s11222-007-9048-5

Received: 05 January 2007
Accepted: 14 December 2007
Published: 09 January 2008
Issue Date: June 2008
DOI: https://doi.org/10.1007/s11222-007-9048-5
