Mining Health Claims Data for Assessing Patient Risk

Part of the Intelligent Systems Reference Library book series (ISRL, volume 25)


As all countries struggle with rising medical costs and increased demand for services, there is enormous need and opportunity for mining claims and encounter data to predict risk. This chapter discusses the identification and modeling of health risk, a topic of importance to health systems and other payers. We begin with a definition of health risk that focuses on the frequency and severity of the events that cause patients to use healthcare services. The distribution of risk among members of a population is highly skewed: a few members use disproportionate amounts of resources, while the large majority use more moderate amounts. An important modeling challenge for health analysts and actuaries is predicting which members of the population will fall in the tail of the distribution, where frequency is low but severity is high. Actuaries have traditionally predicted resource use from age and sex, together with other factors such as geography and employer industry. We review typical actuarial models and then evaluate the potential for increasing the relevance and accuracy of risk prediction using medical condition-based models. We discuss the types of data frequently available to analysts in health systems that generate medical and drug claims, and how these data are interpreted. We also develop a simple grouper model to illustrate the principle of “grouping” diagnosis codes for analysis. We examine in more depth the process of developing algorithms to identify the medical condition(s) present in a population as the basis for predicting risk, and conclude with a discussion of some of the commercially available grouper models used for this purpose in the U.S. and other countries.


Claim Data; Diagnosis Code; Health Risk Assessment; Pharmacy Claim; Drug Claim
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  1. Dept. of Statistics & Applied Probability, University of California Santa Barbara, Santa Barbara, USA
