Abstract
Healthcare analytics refers to data analytic methods applied in the healthcare domain. Healthcare analytics is becoming a prominent data science domain because of the societal and economic burden of disease and the opportunities to better understand the healthcare system through the analysis of data. This chapter introduces the reader to the domain through the analysis of diabetes prevalence and incidence. The data are drawn from the Centers for Disease Control and Prevention’s Behavioral Risk Factor Surveillance System.
Notes
- 1.
An accountable care organization (ACO) is a network of physicians and hospitals that provide patient care. ACOs have a responsibility to ensure quality care and limit expenditures while allowing patients some freedom in selecting specific medical services.
- 2.
The tutorials of this chapter will reveal substantial geographic differences in prevalence and incidence across the United States.
- 3.
It’s controllable in the sense that related conditions such as retinopathy can be avoided or delayed.
- 4.
We’ve discussed and used BRFSS data in Chap. 3.
- 5.
The sampling weights reflect the likelihood of selecting a particular respondent but are not the probability of selecting the respondent.
- 6.
If incidence is approximately constant over the interval, \(\widehat{\beta }_{0,i}\) is a more precise estimator of prevalence at the midpoint of the time span.
- 7.
The value labels for a specific question are usually the same from year to year.
- 8.
This is the question asked in the year 2004 survey. The exact phrasing has changed over time.
- 9.
- 10.
Federal Information Processing Standards
- 11.
- 12.
The number of pairs, m, will be 15 except for Louisiana and some U.S. territories.
- 13.
If the BRFSS samples were random samples, then we would call the probability estimate an empirical probability.
- 14.
The precision of a prediction is directly related to the variance of the estimator, and the variance depends on the number of observations used to compute the estimate.
- 15.
Other variables are potentially useful for prediction (race and exercise level).
- 16.
Not every possible profile was observed. The number of observed profiles was 14,270, slightly fewer than the 14,784 possible profiles.
- 17.
We could define the event of interest more rigorously as metabolic syndrome, a set of medical conditions that are considered to be precursors to type 2 diabetes.
- 18.
functions.py should reside in a directory below parent. For instance, the full path might be /home/HealthCare/PythonScripts/functions.py, in which case parent is /home/HealthCare.
- 19.
The algorithm is essentially an implementation of the one-nearest neighbor prediction function.
- 20.
In the tutorial of Sect. 7.5, the predictor variables are age, education, income, and body mass index and so p = 4.
- 21.
A cohort is a population subgroup with similar characteristics.
- 22.
In the unlikely event that the target profile is not in the dictionary, we find a set of most similar profiles in the dictionary.
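Notes 19, 20, and 22 together describe a lookup-based prediction: profiles built from p = 4 predictors are stored in a dictionary, an exact match is used when available, and otherwise the most similar profiles are used. The following is a minimal sketch of that idea, not the chapter's own code. The function names, the city-block distance, the pooling of tied nearest profiles, and the use of sampling weights as simple multipliers are all assumptions made for illustration, and the records are invented toy values.

```python
from collections import defaultdict


def build_profile_table(records):
    """Map each profile (a tuple of p predictor values) to
    [weighted number of events, weighted number of respondents]."""
    table = defaultdict(lambda: [0.0, 0.0])
    for profile, has_event, weight in records:
        counts = table[tuple(profile)]
        counts[0] += weight * has_event
        counts[1] += weight
    return dict(table)


def city_block(a, b):
    """City-block distance between two profiles (an assumed similarity measure)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def predict(table, target):
    """Estimate the event probability for a target profile.

    Uses the exact profile if it is in the table; otherwise pools
    every profile at the smallest distance from the target."""
    target = tuple(target)
    if target in table:
        matches = [target]
    else:
        best = min(city_block(target, prof) for prof in table)
        matches = [prof for prof in table if city_block(target, prof) == best]
    events = sum(table[prof][0] for prof in matches)
    total = sum(table[prof][1] for prof in matches)
    return events / total


# Toy records: (profile, event indicator, sampling weight). Hypothetical values.
records = [
    ((3, 4, 5, 27), 1, 2.0),
    ((3, 4, 5, 27), 0, 6.0),
    ((5, 4, 5, 27), 1, 1.0),
    ((5, 4, 5, 27), 0, 1.0),
    ((9, 1, 1, 40), 1, 3.0),
]
table = build_profile_table(records)
exact = predict(table, (3, 4, 5, 27))    # profile is in the table
nearest = predict(table, (4, 4, 5, 27))  # profile is not in the table
```

In this toy data the second target is equally close to two stored profiles, so their weighted counts are pooled before the ratio is taken. A different distance or tie-breaking rule would be an equally reasonable choice.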
Copyright information
© 2016 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Steele, B., Chandler, J., Reddy, S. (2016). Healthcare Analytics. In: Algorithms for Data Science. Springer, Cham. https://doi.org/10.1007/978-3-319-45797-0_7
DOI: https://doi.org/10.1007/978-3-319-45797-0_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-45795-6
Online ISBN: 978-3-319-45797-0
eBook Packages: Computer Science, Computer Science (R0)