Abstract
Walter et al. (2001) developed a model to identify older adults at high risk of death in the first year after hospitalization, using data collected for 2,922 patients discharged from two hospitals in Ohio. Potential predictors included demographics, activities of daily living (ADLs), the APACHE-II illness-severity score, and information about the index hospitalization. A “backward” selection procedure with a restrictive inclusion criterion was used to choose a multipredictor model, using data from one of the two hospitals. The model was then validated using data from the other hospital. The goal was to select a model that best predicted future events, with a view toward identifying patients in need of more intensive monitoring and intervention.
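The develop-on-one-hospital, validate-on-the-other workflow described above can be sketched roughly as follows. This is only an illustration on synthetic data, not the procedure or code used by Walter et al.: the predictor names, the simulated effect sizes, the retention threshold `alpha=0.01` (standing in for the "restrictive inclusion criterion"), and the helper functions `fit_logistic`, `backward_select`, and `c_statistic` are all assumptions introduced here.

```python
import math
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Newton-Raphson logistic fit; returns coefficients and two-sided Wald p-values."""
    X1 = np.column_stack([np.ones(len(y)), X])           # add intercept column
    beta = np.zeros(X1.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X1 @ beta))
        info = X1.T @ (X1 * (p * (1.0 - p))[:, None])    # Fisher information
        beta = beta + np.linalg.solve(info, X1.T @ (y - p))
    p = 1.0 / (1.0 + np.exp(-X1 @ beta))
    info = X1.T @ (X1 * (p * (1.0 - p))[:, None])        # information at the final fit
    se = np.sqrt(np.diag(np.linalg.inv(info)))
    pvals = np.array([math.erfc(abs(z) / math.sqrt(2.0)) for z in beta / se])
    return beta, pvals

def backward_select(X, y, alpha=0.01):
    """Drop the least significant predictor until all remaining have p < alpha."""
    keep = list(range(X.shape[1]))
    while keep:
        _, pvals = fit_logistic(X[:, keep], y)
        worst = int(np.argmax(pvals[1:]))                # index 0 is the intercept
        if pvals[1:][worst] < alpha:
            break
        del keep[worst]
    return keep

def c_statistic(score, y):
    """Probability that a random event has a higher score than a random non-event."""
    diff = score[y == 1][:, None] - score[y == 0][None, :]
    return float(np.mean(diff > 0) + 0.5 * np.mean(diff == 0))

# Synthetic stand-in for the two-hospital data (hypothetical predictors and effects).
rng = np.random.default_rng(0)
names = ["age", "adl_dependent", "apache_ii", "noise_1", "noise_2", "noise_3"]
n = 3000
X = rng.normal(size=(n, len(names)))
true_beta = np.array([1.0, 0.8, 0.6, 0.0, 0.0, 0.0])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(X @ true_beta - 1.0)))).astype(float)
hospital = rng.integers(0, 2, size=n)
dev, val = hospital == 0, hospital == 1

# Select and fit on the development hospital only.
keep = backward_select(X[dev], y[dev], alpha=0.01)
selected = [names[j] for j in keep]
beta, _ = fit_logistic(X[dev][:, keep], y[dev])

# Score the validation hospital with the frozen development model.
risk = beta[0] + X[val][:, keep] @ beta[1:]
auc = c_statistic(risk, y[val])
print(selected, round(auc, 3))
```

The point of the sketch is that the validation hospital never influences which predictors are retained or how they are weighted, so its C-statistic is a more honest estimate of how well the model predicts future events than the same statistic computed on the development data.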
Keywords
- Bayesian Information Criterion
- Treatment Effect Estimate
- Primary Predictor
- Candidate Predictor
- Maternal Weight Gain
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
References
Allen, D. M. and Cady, F. B. (1982). Analyzing Experimental Data by Regression. Wadsworth, Belmont, CA.
Altman, D. G. and Andersen, P. K. (1989). Bootstrap investigation of the stability of the Cox regression model. Statistics in Medicine, 8, 771–783.
Altman, D. G. and Royston, P. (2000). What do we mean by validating a prognostic model? Statistics in Medicine, 19, 453–473.
Antman, E. M., Cohen, M., Bernink, P. J. L. M., McCabe, C. H., Horacek, T., Papuchis, G., Mautner, B., Corbalan, R., Radley, D. and Braunwald, E. (2000). The TIMI Risk Score for unstable angina/non-ST elevation MI. Journal of the American Medical Association, 284(7), 835–842.
Beach, M. L. and Meier, P. (1989). Choosing covariates in the analysis of clinical trials. Controlled Clinical Trials, 10, 161S–175S.
Begg, M. D. and Lagakos, S. (1993). Loss in efficiency caused by omitted covariates and misspecifying exposure in logistic regression models. Journal of the American Statistical Association, 88(421), 166–170.
Breiman, L. (2001). Statistical modeling: the two cultures. Statistical Science, 16(3), 199–231.
Brown, J., Vittinghoff, E., Wyman, J. F., Stone, K. L., Nevitt, M. C., Ensrud, K. E. and Grady, D. (2000). Urinary incontinence: does it increase risk for falls and fractures? Study of Osteoporotic Fractures Research Group. Journal of the American Geriatrics Society, 48, 721–725.
Buckland, S. T., Burnham, K. P. and Augustin, N. H. (1997). Model selection: an integral part of inference. Biometrics, 53, 603–618.
Chatfield, C. (1995). Model uncertainty, data mining and statistical inference. Journal of the Royal Statistical Society, Series A, 158, 419–466.
Cook, N. R., Buring, J. E. and Ridker, P. M. (2006). The effect of including C-reactive protein in cardiovascular risk prediction models for women. Annals of Internal Medicine, 145, 21–29.
Crager, M. R. (1987). Analysis of covariance in parallel-group clinical trials with pretreatment baselines. Biometrics, 43(4), 895–901.
D’Agostino, R. B., Russell, M. W., Huse, D. M., Ellison, C., Silbershatz, H., Wilson, P. W. F. and Hartz, S. C. (2000). Primary and subsequent coronary risk appraisal: new results from the Framingham Study. American Heart Journal, 139, 272–281.
Gail, M. H., Tan, W. Y. and Piantadosi, S. (1988). Tests for no treatment effect in randomized clinical trials. Biometrika, 75, 57–64.
Gail, M. H., Wieand, S. and Piantadosi, S. (1984). Biased estimates of treatment effect in randomized experiments with nonlinear regressions and omitted covariates. Biometrika, 71, 431–444.
Glymour, M. M., Weuve, J., Berkman, L. F., Kawachi, I. and Robins, J. M. (2005). When is baseline adjustment useful in analyses of change? An example with education and cognitive change. American Journal of Epidemiology, 163(3), 267–278.
Gordon, W., Polansky, J., Boscardin, W., Fung, K. and Steinman, M. (2010). Coronary risk assessment by point-based and equation-based Framingham models: significant implications for clinical care. Journal of General Internal Medicine, 25(11), 1145–1151.
Greenland, S. (1989). Modeling and variable selection in epidemiologic analysis. American Journal of Public Health, 79(3), 340–349.
Greenland, S. (2003). Quantifying biases in causal models: classical confounding vs collider-stratification bias. Epidemiology, 14(3), 300–306.
Greenland, S. and Brumback, B. (2002). An overview of relations among causal modeling methods. International Journal of Epidemiology, 31(5), 1030–1037.
Greenland, S., Pearl, J. and Robins, J. M. (1999). Causal diagrams for epidemiologic research. Epidemiology, 10, 37–48.
Grodstein, F., Manson, J. E. and Stampfer, M. J. (2001). Postmenopausal hormone use and secondary prevention of coronary events in the Nurses’ Health Study. Annals of Internal Medicine, 135, 1–8.
Harrell, F. E. (2005). Regression Modeling Strategies. Springer, New York.
Harrell, F. E., Lee, K. L., Califf, R. M., Pryor, D. B. and Rosati, R. A. (1984). Regression modelling strategies for improved prognostic prediction. Statistics in Medicine, 3, 143–152.
Hauck, W. W., Anderson, S. and Marcus, S. M. (1998). Should we adjust for covariates in nonlinear regression analyses of randomized trials? Controlled Clinical Trials, 19, 249–256.
Henderson, R. and Oman, P. (1999). Effect of frailty on marginal regression estimates in survival analysis. Journal of the Royal Statistical Society, Series B, Methodological, 61, 367–379.
Hernán, M. A., Hernández-Díaz, S. and Robins, J. M. (2004). A structural approach to selection bias. Epidemiology, 15(5), 615–625.
Hoerl, A. E. and Kennard, R. W. (1970). Ridge regression: biased estimates for nonorthogonal problems. Technometrics, 12, 55–67.
Hosmer, D. W. and Lemeshow, S. (2000). Applied Logistic Regression. John Wiley & Sons, New York, Chichester.
Hulley, S., Grady, D., Bush, T., Furberg, C., Herrington, D., Riggs, B. and Vittinghoff, E. (1998). Randomized trial of estrogen plus progestin for secondary prevention of heart disease in postmenopausal women. The Heart and Estrogen/progestin Replacement Study. Journal of the American Medical Association, 280(7), 605–613.
Jewell, N. P. (2004). Statistics for Epidemiology. Chapman & Hall/CRC, Boca Raton, FL.
Kanaya, A., Vittinghoff, E., Shlipak, M. G., Resnick, H. E., Visser, M., Grady, D. and Barrett-Connor, E. (2004). Association of total and central obesity with mortality in postmenopausal women with coronary heart disease. American Journal of Epidemiology, 158(12), 1161–1170.
Lagakos, S. W. and Schoenfeld, D. A. (1984). Properties of proportional-hazards score tests under misspecified regression models. Biometrics, 40, 1037–1048.
Linhart, H. and Zucchini, W. (1986). Model Selection. John Wiley & Sons, New York, Chichester.
Maldonado, G. and Greenland, S. (1993). Simulation study of confounder-selection strategies. American Journal of Epidemiology, 138, 923–936.
Meier, P., Ferguson, D. J. and Karrison, T. (1985). A controlled trial of extended radical mastectomy. Cancer, 55, 880–891.
Miller, A. J. (1990). Subset Selection in Regression. Chapman & Hall Ltd, London, New York.
Neuhaus, J. (1998). Estimation efficiency with omitted covariates in generalized linear models. Journal of the American Statistical Association, 93, 1124–1129.
Neuhaus, J. and Jewell, N. P. (1993). A geometric approach to assess bias due to omitted covariates in generalized linear models. Biometrika, 80, 807–815.
Orwoll, E., Bauer, D. C., Vogt, T. M. and Fox, K. M. (1996). Axial bone mass in older women. Annals of Internal Medicine, 124(2), 185–197.
Parzen, M. and Lipsitz, S. R. (1999). A global goodness-of-fit statistic for Cox regression models. Biometrics, 55, 580–584.
Pearl, J. (1995). Causal diagrams for empirical research. Biometrika, 82, 669–688.
Pencina, M. J., D’Agostino Sr, R. B., D’Agostino Jr, R. B. and Vasan, R. S. (2008). Evaluating the added predictive ability of a new marker: from area under the ROC curve to reclassification and beyond. Statistics in Medicine, 27, 157–172.
Schmoor, C. and Schumacher, M. (1997). Effects of covariate omission and categorization when analysing randomized trials with the Cox model. Statistics in Medicine, 16, 225–237.
Steyerberg, E. W. (2009). Clinical Prediction Models. Springer, New York.
Sun, G. W., Shook, T. L. and Kay, G. L. (1999). Inappropriate use of bivariable analysis to screen risk factors for use in multivariable analysis. Journal of Clinical Epidemiology, 49, 907–916.
Tibshirani, R. (1997). The lasso method for variable selection in the Cox model. Statistics in Medicine, 16, 385–395.
van Houwelingen, H. C. (2000). Validation, calibration, revision, and combination of prognostic survival models. Statistics in Medicine, 19, 3401–3415.
Vittinghoff, E. and McCulloch, C. E. (2007). Relaxing the rule of ten events per variable in logistic and Cox regression. American Journal of Epidemiology, 165, 710–718.
Vittinghoff, E., Shlipak, M. G., Varosy, P. D., Furberg, C. D., Ireland, C. C., Khan, S. S., Blumenthal, R., Barrett-Connor, E. and Hulley, S. (2003). Risk factors and secondary prevention in women with heart disease: The Heart and Estrogen/progestin Replacement Study. Annals of Internal Medicine, 138(2), 81–89.
Walter, L. C., Brand, R. J., Counsell, S. R., Palmer, R. M., Landefeld, C. S., Fortinsky, R. H. and Covinsky, K. E. (2001). Development and validation of a prognostic index for 1-year mortality in older adults after hospitalization. Journal of the American Medical Association, 285(23), 2987–2994.
Weisberg, S. (1985). Applied Linear Regression. John Wiley & Sons, New York, Chichester.
Whooley, M., de Jonge, P., Vittinghoff, E., Otte, C., Moos, R., Carney, R., Ali, S., Carney, R., Na, B., Feldman, M., Schiller, N. and Browner, W. (2008). Depressive symptoms, health behaviors, and risk of cardiovascular events in patients with coronary heart disease. Journal of the American Medical Association, 300(20), 2379–2388.
Concato, J., Peduzzi, P. and Holford, T. R. (1995). Importance of events per independent variable in proportional hazards analysis I. Background, goals, and general strategy. Journal of Clinical Epidemiology, 48, 1495–1501.
Grady, D., Wenger, N. K., Herrington, D., Khan, S., Furberg, C., Hunninghake, D., Vittinghoff, E. and Hulley, S. (2000). Postmenopausal hormone therapy increases risk of venous thromboembolic disease. The Heart and Estrogen/progestin Replacement Study. Annals of Internal Medicine, 132(9), 689–696.
Molinaro, A. and van der Laan, M. J. (2004). Deletion/substitution/addition algorithm for partitioning the covariate space in prediction. U.C. Berkeley Division of Biostatistics Working Paper Series, Paper 162.
Peduzzi, P., Concato, J. and Feinstein, A. R. (1995). Importance of events per independent variable in proportional hazards regression analysis II. Accuracy and precision of regression estimates. Journal of Clinical Epidemiology, 48, 1503–1510.
Rothman, K. J. and Greenland, S. (1998). Modern Epidemiology. Lippincott Williams & Wilkins Publishers, Philadelphia, PA, 2nd ed.
Copyright information
© 2012 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Vittinghoff, E., Glidden, D.V., Shiboski, S.C., McCulloch, C.E. (2012). Predictor Selection. In: Regression Methods in Biostatistics. Statistics for Biology and Health. Springer, Boston, MA. https://doi.org/10.1007/978-1-4614-1353-0_10
DOI: https://doi.org/10.1007/978-1-4614-1353-0_10
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4614-1352-3
Online ISBN: 978-1-4614-1353-0
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)