Understanding the Effects of Sampling on Healthcare Risk Modeling for the Prediction of Future High-Cost Patients
- Cite this paper as:
- Moturu S.T., Liu H., Johnson W.G. (2008) Understanding the Effects of Sampling on Healthcare Risk Modeling for the Prediction of Future High-Cost Patients. In: Fred A., Filipe J., Gamboa H. (eds) Biomedical Engineering Systems and Technologies. BIOSTEC 2008. Communications in Computer and Information Science, vol 25. Springer, Berlin, Heidelberg
Rapidly rising healthcare costs represent one of the major issues plaguing the healthcare system. Data from the Arizona Health Care Cost Containment System, Arizona’s Medicaid program, provide a unique opportunity to exploit state-of-the-art machine learning and data mining algorithms to analyze data and provide actionable findings that can aid cost containment. Our work addresses specific challenges in this real-life healthcare application with respect to data imbalance in the process of building predictive risk models for forecasting high-cost patients. We survey the literature and propose novel data mining approaches customized for this compelling application, with a specific focus on non-random sampling. Our empirical study indicates that the proposed approach is highly effective and can benefit further research on cost containment in the healthcare industry.
Keywords: predictive risk modeling; health care expenditures; Medicaid; future high-cost patients; data mining; non-random sampling; risk adjustment; skewed data; imbalanced data; classification