Healthcare Analytics

Steele, Brian; Chandler, John; Reddy, Swarna

doi:10.1007/978-3-319-45797-0_7

Abstract

Healthcare analytics refers to data analytic methods applied in the healthcare domain. Healthcare analytics is becoming a prominent data science domain because of the societal and economic burden of disease and the opportunities to better understand the healthcare system through the analysis of data. This chapter introduces the reader to the domain through the analysis of diabetes prevalence and incidence. The data are drawn from the Centers for Disease Control and Prevention’s Behavioral Risk Factor Surveillance System.

Notes

1.
An accountable care organization (ACO) is a network of physicians and hospitals that provide patient care. ACOs have a responsibility to ensure quality care and limit expenditures while allowing patients some freedom in selecting specific medical services.
2.
The tutorials of this chapter will reveal substantial geographic differences in prevalence and incidence across the United States.
3.
It’s controllable in the sense that related conditions such as retinopathy can be avoided or delayed.
4.
We’ve discussed and used BRFSS data in Chap. 3.
5.
The sampling weights reflect the likelihood of selecting a particular respondent but are not the probability of selecting the respondent.
6.
If incidence is approximately constant over the interval, \(\widehat{\beta }_{0,i}\) is a more precise estimator of prevalence at the midpoint of the time span.
7.
The value labels for a specific question are usually the same from year to year.
8.
This is the question asked in the year 2004 survey. The exact phrasing has changed over time.
9.
You may already have some of these from having worked on the tutorial of Chap. 3, Sect. 3.6.
10.
Federal Information Processing Standards
11.
Chapter 3, Sect. 3.6 discusses the creation of the functions.py module.
12.
The number of pairs m will be 15 except for Louisiana and some U.S. territories.
13.
If the BRFSS samples were random samples, then we would call the probability estimate an empirical probability.
14.
The precision of a prediction is directly related to the variance of the estimator, and the variance depends on the number of observations used to compute the estimate.
15.
Other variables are potentially useful for prediction (race and exercise level).
16.
Not every possible profile was observed. The number of observed profiles was 14,270, slightly fewer than the 14,784 possible.
17.
We could define the event of interest more rigorously as metabolic syndrome, a set of medical conditions that are considered to be precursors to type 2 diabetes.
18.
functions.py should reside in a directory below parent. For instance, the full path might be /home/HealthCare/PythonScripts/functions.py, in which case parent is /home/HealthCare.
19.
The algorithm is essentially an implementation of the one-nearest neighbor prediction function.
20.
In the tutorial of Sect. 7.5, the predictor variables are age, education, income, and body mass index and so p = 4.
21.
A cohort is a population subgroup with similar characteristics.
22.
In the unlikely event that the target profile is not in the dictionary, we find a set of most similar profiles in the dictionary.
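
Notes 19, 20, and 22 describe a prediction rule that looks up a target profile in a dictionary and, when the target is absent, falls back on the most similar observed profiles. A minimal Python sketch of that idea follows. The dictionary layout (a profile tuple mapped to a pair of weighted counts), the function name predict, the use of squared Euclidean distance as the similarity measure, and the toy values are all assumptions made for illustration; none of them is taken from the chapter.

```python
def predict(profiles, target):
    """Estimate the probability of the event for a target profile.

    profiles maps a profile tuple (one entry per predictor variable)
    to a pair [weighted count with the event, weighted count in total].
    This layout is hypothetical.
    """
    if target in profiles:
        matches = [target]
    else:
        # Target not observed: use the observed profile(s) closest to it.
        def dist(profile):
            return sum((a - b) ** 2 for a, b in zip(profile, target))

        smallest = min(dist(p) for p in profiles)
        matches = [p for p in profiles if dist(p) == smallest]

    # Pool the counts over all matched profiles (ties are combined).
    events = sum(profiles[p][0] for p in matches)
    total = sum(profiles[p][1] for p in matches)
    return events / total


# Toy data with p = 4 predictor variables per profile (made-up values).
profiles = {
    (3, 4, 5, 27): [2.0, 10.0],
    (3, 4, 6, 27): [1.0, 10.0],
    (9, 2, 3, 35): [6.0, 12.0],
}

print(predict(profiles, (3, 4, 5, 27)))  # profile is in the dictionary
print(predict(profiles, (9, 2, 3, 36)))  # falls back on the nearest profile
```

Pooling the counts of tied nearest profiles, rather than picking one arbitrarily, keeps the estimate independent of dictionary ordering; other tie-breaking rules are equally plausible.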

References

C.C. Aggarwal, Data Mining - The Textbook (Springer, New York, 2015)
American Diabetes Association, http://www.diabetes.org/diabetes-basics/statistics/. Accessed 15 June 2016
Centers for Disease Control and Prevention, Behavioral Risk Factor Surveillance System Weighting BRFSS Data (2013). http://www.cdc.gov/brfss/annual_data/2013/pdf/Weighting_Data.pdf
Centers for Disease Control and Prevention, The BRFSS Data User Guide (2013). http://www.cdc.gov/brfss/data_documentation/pdf/userguidejune2013.pdf
E.L. Korn, B.I. Graubard, Examples of differing weighted and unweighted estimates from a sample survey. Am. Stat. 49 (3), 291–295 (1995)
X. Zhuo, P. Zhang, T.J. Hoerger, Lifetime direct medical costs of treating type 2 diabetes and diabetic complications. Am. J. Prev. Med. 45 (3), 253–256 (2013)

Author information

Authors and Affiliations

University of Montana, Missoula, MT, USA
Brian Steele
School of Business Administration, University of Montana, Missoula, MT, USA
John Chandler
SoftMath Consultants, LLC, Missoula, MT, USA
Swarna Reddy

About this chapter

Cite this chapter

Steele, B., Chandler, J., Reddy, S. (2016). Healthcare Analytics. In: Algorithms for Data Science. Springer, Cham. https://doi.org/10.1007/978-3-319-45797-0_7

DOI: https://doi.org/10.1007/978-3-319-45797-0_7
Published: 27 December 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-45795-6
Online ISBN: 978-3-319-45797-0
eBook Packages: Computer Science, Computer Science (R0)