Overfitting and Optimism in Prediction Models

Steyerberg, Ewout W.

doi:10.1007/978-3-030-16399-0_5

Part of the book series: Statistics for Biology and Health (SBH)

Abstract

If we develop a statistical model with the main aim of outcome prediction, we are primarily interested in the validity of the predictions for new subjects, outside the sample under study. A key threat to validity is overfitting: the data under study are well described, but predictions are not valid for new subjects. Overfitting causes optimism about a model’s performance in new subjects. After introducing overfitting and optimism, we illustrate overfitting with a simple example of comparisons of mortality figures by hospital. We find that using the observed average outcome per center as the prediction of mortality would exaggerate any true differences between centers. Bootstrap resampling is presented as a central technique to correct for overfitting and to quantify optimism in model performance.
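The hospital comparison can be illustrated with a small simulation. The sketch below is not taken from the chapter: the number of centers, patients per center, mean risk, and between-center spread are all assumed values chosen for illustration, and the function name `simulate_centers` is hypothetical. It draws a true mortality risk for each center, simulates deaths, and compares the spread of the observed rates with the spread of the true risks.

```python
import random
import statistics


def simulate_centers(n_centers=100, n_patients=50,
                     mean_risk=0.10, sd_risk=0.02, seed=1):
    """Simulate true and observed mortality per center (illustrative values)."""
    rng = random.Random(seed)
    true_risks = []
    observed_rates = []
    for _ in range(n_centers):
        # True underlying risk for this center, clipped to a valid probability.
        risk = min(max(rng.gauss(mean_risk, sd_risk), 0.01), 0.99)
        # Each patient dies independently with that center's true risk.
        deaths = sum(rng.random() < risk for _ in range(n_patients))
        true_risks.append(risk)
        observed_rates.append(deaths / n_patients)
    return true_risks, observed_rates


true_risks, observed_rates = simulate_centers()

# Observed rates combine true between-center differences with sampling noise,
# so they vary more than the true risks do.
sd_true = statistics.pstdev(true_risks)
sd_observed = statistics.pstdev(observed_rates)
print(f"SD of true risks:     {sd_true:.3f}")
print(f"SD of observed rates: {sd_observed:.3f}")
```

Under these assumptions the observed rates are more spread out than the true risks, because each observed rate adds binomial sampling noise to the true between-center variation. Taking the observed rates at face value as predictions would therefore overstate how much the centers really differ.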

Author information

Authors and Affiliations

Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands
Ewout W. Steyerberg PhD

About this chapter

Cite this chapter

Steyerberg, E.W. (2019). Overfitting and Optimism in Prediction Models. In: Clinical Prediction Models. Statistics for Biology and Health. Springer, Cham. https://doi.org/10.1007/978-3-030-16399-0_5

DOI: https://doi.org/10.1007/978-3-030-16399-0_5
Published: 23 July 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-16398-3
Online ISBN: 978-3-030-16399-0
eBook Packages: Mathematics and Statistics (R0)

Publish with us

Policies and ethics