Regression Analysis

Etzioni, Ruth; Mandel, Micha; Gulati, Roman

doi:10.1007/978-3-030-59889-1_3

Part of the book series: Springer Texts in Statistics (STS)

Abstract

This chapter introduces regression analysis, the cornerstone of hypothesis-driven inquiry about health care outcomes. Regression analysis is the quantitative framework that is most commonly used to establish whether outcomes are associated with individual, community, or environmental characteristics. It quantifies the strength of relationships in conceptual models of health care utilization and costs. It provides a framework for explaining why some people incur extremely high health care expenses and why others barely cost anything. It estimates effects of health interventions. And it enables prediction of future costs and outcomes. This chapter presumes a basic knowledge of the concepts of linear regression (also known as ordinary least squares regression). We do not focus on mathematical details; rather, we present the critical ideas that form a practical foundation for regression analysis using observational health care databases.

Notes

1. Many textbooks present this test in terms of the residual sum of squares (RSS) instead of the variance as used here. These are equivalent approaches since \(\mathrm{RSS} = (n - 1) \times \mathrm{Var}_{\mathrm{res}}\), where \(n\) is the number of observations in the data.
2. Here and everywhere else in this text, the “\(\log\)” function refers to the natural logarithm, which is sometimes denoted “\(\ln\).”
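
The identity in Note 1 can be checked numerically. The following is a minimal sketch, not taken from the chapter: it assumes simulated data, a simple linear model fitted with an intercept (so the residuals average to zero), and Var_res computed as the sample variance of the residuals with divisor n − 1. The variable names and the simulated coefficients are illustrative assumptions.

```python
import numpy as np

# Simulated data (hypothetical, for illustration only)
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)

# Ordinary least squares with an intercept column
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta

# Residual sum of squares and sample variance of the residuals (divisor n - 1)
rss = float(np.sum(residuals ** 2))
var_res = float(np.var(residuals, ddof=1))

# With an intercept, the residuals have mean zero, so RSS = (n - 1) * Var_res
print(np.isclose(rss, (n - 1) * var_res))
```

The equality relies on the intercept: without it, the residuals need not average to zero and the sample variance would subtract a nonzero mean, so the two quantities could differ.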

Author information

Authors and Affiliations

Fred Hutchinson Cancer Research Center, University of Washington, Seattle, WA, USA
Ruth Etzioni
Department of Statistics and Data Science, Hebrew University of Jerusalem, Mount Scopus, Jerusalem, Israel
Micha Mandel
Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
Roman Gulati

About this chapter

Cite this chapter

Etzioni, R., Mandel, M., Gulati, R. (2020). Regression Analysis. In: Statistics for Health Data Science. Springer Texts in Statistics. Springer, Cham. https://doi.org/10.1007/978-3-030-59889-1_3

DOI: https://doi.org/10.1007/978-3-030-59889-1_3
Published: 05 January 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59888-4
Online ISBN: 978-3-030-59889-1
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)