A predictive modeling approach to increasing the economic effectiveness of disease management programs

Health Care Management Science

Abstract

Predictive modeling (PM) techniques are gaining importance in the worldwide health insurance business. Modern PM methods are used for customer relationship management, risk evaluation, or medical management. This article illustrates a PM approach that enables the economic potential of (cost-)effective disease management programs (DMPs) to be fully exploited through optimized candidate selection, as an example of successful data-driven business management. The approach is based on a Generalized Linear Model (GLM) that is easy for health insurance companies to apply. Using a small portfolio from an emerging country, we show that our GLM approach is stable compared with more sophisticated regression techniques despite the difficult data environment. Additionally, we demonstrate for this example setting that our model can compete with the expensive solutions offered by professional PM vendors and outperforms the non-predictive standard approaches to DMP selection commonly used in the market.

References

  1. Akaike H (1974) A new look at the statistical model identification. IEEE Trans Autom Control 19(6):716–723

  2. Antonio K, Beirlant J (2007) Actuarial statistics with generalized linear mixed models. Insur Math Econ 40(1):58–76

  3. Belitz C, Brezger A, Kneib T, Lang S (2009) BayesX: software for Bayesian inference in structured additive regression models, version 2.0.1. Available at: http://www.stat.uni-muenchen.de/~bayesx

  4. Billings J, Mijanovich T (2007) Improving the management of care for high-cost medicaid patients. Health Aff 26(6):1643–1655

  5. Blough DK, Madden CW, Hornbrook MC (1999) Modeling risk using generalized linear models. J Health Econ 18:153–171

  6. Bodenheimer T, Lorig K, Holman H, Grumbach K (2002) Patient self-management of chronic disease in primary care. J Am Med Assoc 288(19):2469–2475

  7. Breiman L (1984) Classification and regression trees. Chapman & Hall/CRC, London

  8. Buntin MB, Zaslavsky AM (2004) Too much ado about two-part models and transformation? Comparing methods of modeling Medicare expenditures. J Health Econ 23:525–542

  9. Cameron AC, Trivedi PK (1998) Regression analysis of count data. Cambridge University Press, New York

  10. Davison AC (2003) Statistical models. Cambridge University Press, New York

  11. De Jong P, Heller GZ (2008) Generalized linear models for insurance data. Cambridge University Press, Cambridge

  12. Diehr P, Yanez D, Ash A, Hornbrook M, Lin DY (1999) Methods for analyzing health care utilization and costs. Ann Rev Public Health 20:125–144

  13. Duan N, Manning WG, Morris CN, Newhouse JP (1983) A comparison of alternative models for the demand for medical care. J Bus Econ Stat 1(2):115–126

  14. Fahrmeir L, Kneib T (2010) Bayesian smoothing and regression for longitudinal, spatial and event history data. Oxford University Press, London

  15. Francis L (2001) Neural networks demystified. Tech. rep. Casualty actuarial society forum. Available at: http://casualtyactuarialsociety.com/pubs/forum/01wforum/01wf253.pdf

  16. Francis L (2003) Martian chronicles: is MARS better than neural networks? Tech. rep. Casualty actuarial society forum. Available at: http://casualtyactuarialsociety.com/pubs/forum/03wforum/03wf027.pdf

  17. Freeman R, Lybecker KM, Taylor DW (2011) The effectiveness of disease management programs in the medicaid population. Tech. rep. The Cameron Institute. Available at: http://cameroninstitute.com

  18. Frees EW, Valdez EA (2008) Hierarchical insurance claims modeling. J Am Stat Assoc 103(484):1457–1469

  19. Frees EW, Young VR, Luo Y (1999) A longitudinal data analysis interpretation of credibility models. Insur Math Econ 24:229–247

  20. Freitas AA (2002) Data mining and knowledge discovery with evolutionary algorithms. Springer Verlag, Berlin

  21. Friedman JH (1991) Multivariate adaptive regression splines. Ann Stat 19(1):1–141

  22. Good PI (2005) Introduction to statistics through resampling methods and R/S-PLUS. Wiley, New York

  23. Haberman S, Renshaw AE (1998) Actuarial applications of generalized linear models. In: Hand D, Jacka S (eds) Statistics in finance. Arnold, E., London

  24. IDF: International Diabetes Federation (2011) http://atlas.idf-bxl.org/content/economic-impacts-diabetes and http://www.idf.org/node/23640

  25. Inglis SC, Clark RA, McAlister FA, Ball J, Lewinter C, Cullington D, Stewart S, Cleland JGF (2010) Structured telephone support or telemonitoring programmes for patients with chronic heart failure. Cochrane Database Syst Rev 2010 8:CD007228

  26. Kolyshkina I, Wong SSW, Lim S (2004) Enhancing generalised linear models with data mining. Discussion paper. Casualty actuarial society. Arlington, Virginia. Available at: http://www.casact.org/pubs/dpp/dpp04/04dpp279.pdf

  27. Lamers LM (1999) A risk-adjuster for capitation payments based on the use of prescribed drugs. Med Care 37:824–830

  28. Kuha J (2004) AIC and BIC: comparisons of assumptions and performance. Sociol Methods Res 33(2):188–229

  29. Liang KY, Zeger SL (1986) Longitudinal data analysis using generalized linear models. Biometrika 73(1):13–22

  30. Lorig KR, Ritter P, Stewart AL, Sobel DS, Brown WB, Bandura A, Gonzalez VM, Laurent DD, Holman HR (2001) Chronic disease self-management program: 2-year health status and health care utilization outcomes. Med Care 39(11):1217–1223

  31. MacKay D (2003) Information theory, inference and learning algorithms. Cambridge University Press, Cambridge

  32. Manning WG (1998) The logged dependent variable, heteroscedasticity, and the retransformation problem. J Health Econ 17:283–295

  33. Manning WG, Mullahy J (2001) Estimating log models: to transform or not to transform? J Health Econ 20:461–494

  34. McCullagh P, Nelder JA (1989) Generalized linear models. Chapman & Hall / CRC, London

  35. McCulloch CE, Searle SR (2001) Generalized, linear, and mixed models. Wiley, New York

  36. Mehmud S, Winkelman R (2007) A comparative analysis of claims-based tools for health risk assessment. Tech. rep. Society of Actuaries. Available at: http://www.soa.org/files/pdf/risk-assessmentc.pdf

  37. Meyer J, Smith BM (2008) Chronic disease management: evidence of predictable savings. Tech. rep. Health management associates. Available at: http://www.idph.state.ia.us/hcr_committees/common/pdf/clinicians/savings_report.pdf

  38. Miller AJ (1990) Subset selection in regression. Chapman and Hall, New York

  39. Mullahy J (1998) Much ado about two: reconsidering retransformation and the two-part model in health econometrics. J Health Econ 17:247–281

  40. Newhouse JP, Manning WG, Keeler EB, Sloss EM (1989) Adjusting capitation rates using objective health measures and prior utilization. Health Care Financ R 10(3):41–54

  41. Nugent R (2008) Chronic diseases in developing countries: health and economic burdens. Ann N Y Acad Sci 1136:70–79

  42. Pearce J, Ferrier S (2000) Evaluating the predictive performance of habitat models developed using logistic regression. Ecol Model 133:225–245

  43. Powers CA, Meyer CM, Roebuck MC, Vaziri B (2005) Predictive modeling of total healthcare costs using pharmacy claims data: a comparison of alternative econometric cost modeling techniques. Med Care 43(11):1065–1072

  44. R Development Core Team (2009) R: a language and environment for statistical computing. R foundation for statistical computing. Vienna, Austria. http://www.R-project.org. ISBN 3-900051-07-0

  45. Ripley BD (1996) Pattern recognition and neural networks. Cambridge University Press, Cambridge

  46. Schwarz GE (1978) Estimating the dimension of a model. Ann Stat 6(2):461–464

  47. Tutz G (2000) Analyse kategorialer Daten. Oldenbourg Verlag, Munich

  48. Tutz G, Fahrmeir L (2001) Multivariate statistical modelling based on generalized linear models. Springer, New York

  49. Veazie PJ, Manning WG, Kane RL (2003) Improving risk adjustment for medicare capitated reimbursement using nonlinear models. Med Care 41(6):741–752

  50. Venables WN, Ripley BD (2002) Modern applied statistics with S, 4th edn. Springer, Berlin

  51. Viaene S, Derrig RA, Baesens B, Dedene G (2002) A comparison of state-of-the-art classification techniques for expert automobile insurance claim fraud detection. J Risk Insur 69(3):373–421

  52. Wedderburn RWM (1974) Quasi-likelihood functions, generalized linear models and the Gauss-Newton method. Biometrika 61(3):439–447

  53. Yau KW, Lee AH, Ng ASK (2002) A zero-augmented gamma mixed model for longitudinal data with many zeros. Aust N Z J Stat 44(2):177–183


Acknowledgments

We would like to thank the health insurance company concerned for providing us with claims data and the three PM vendors for participating in the test.

Author information

Corresponding author

Correspondence to Andreas Bayerstadler.

Appendix: Definition and interpretation of different predictive measures

In this appendix, we introduce two kinds of predictive measures that permit a comparison of cost prediction techniques with regard to an efficient DMP selection:

  a) Measures for the accuracy of a forecast that indirectly assess how many of the patients with the highest saving potential can be identified

  b) Measures for the sorting capacity of a forecast that directly examine the same question

The latter group of measures might be more suitable for analyzing the direct cost benefit in the actual data situation described. For comparing the general ability of an approach to optimize the selection of DMP participants by forecasting claimed amounts, the measures in group a) are equally important.

  a) Measures for the accuracy of a forecast

Two predictive measures quantifying the prediction error are the mean predictive squared error (MPSE) and the mean predictive absolute error (MPAE) defined as

$$\begin{array}{rll} \text{MPSE} &:=& \frac{1}{n}\sum\limits_{i=1}^{n} \left(y_{i}^{*}-y_{o,i}\right)^{2} \quad \text{and} \\ \text{MPAE} &:=& \frac{1}{n}\sum\limits_{i=1}^{n} \left|y_{i}^{*}-y_{o,i}\right|. \end{array} $$

Both measures analyze the differences between predicted (\(\boldsymbol{y}^{*} = (y_{1}^{*},\ldots,y_{n}^{*})'\)) and actually observed (\(\boldsymbol{y}_{o} = (y_{o,1},\ldots,y_{o,n})'\)) costs. The MPSE is based on the quadratic loss function, so predictions that avoid extreme discrepancies between \(\boldsymbol{y}^{*}\) and \(\boldsymbol{y}_{o}\) are rated best. By contrast, the MPAE, which is based on the absolute loss function, favors predictions that are good “on average”. The MPSE thus rewards precise predictions for high-cost members, where the largest errors typically occur. This is very relevant for DMP selection, because failing to recognize a member who will, without preventive interaction, produce exploding medical costs in the near future means that the individual saving potential related to a DMP cannot be realized.
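The contrast between the two loss functions can be seen in a minimal sketch. The function names and the toy claims figures below are hypothetical and chosen only for illustration; they are not taken from the portfolio analyzed in the article.

```python
def mpse(predicted, observed):
    """Mean predictive squared error between predicted and observed costs."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(predicted)


def mpae(predicted, observed):
    """Mean predictive absolute error between predicted and observed costs."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)


# Hypothetical toy data: claimed amounts of four members, one of them high-cost.
observed = [100.0, 200.0, 300.0, 10000.0]
# Forecast A is exact for typical members but misses the high-cost member by 4000.
forecast_a = [100.0, 200.0, 300.0, 6000.0]
# Forecast B is off by 1000 for every member.
forecast_b = [1100.0, 1200.0, 1300.0, 9000.0]

print(mpae(forecast_a, observed), mpae(forecast_b, observed))  # 1000.0 1000.0
print(mpse(forecast_a, observed), mpse(forecast_b, observed))  # 4000000.0 1000000.0
```

In this toy case both forecasts have the same MPAE, while the MPSE penalizes forecast A for its single large miss on the high-cost member.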

A disadvantage of both MPSE and MPAE is that these measures are not normed like, for example, the coefficient of determination or model R-squared that measures goodness-of-fit in linear models and ranges between 0 and 1. Hence, it is desirable to define a normed measure that is bound to a limited interval of possible values and measures the absolute predictive quality of a model. For this purpose, we define the so-called predictive R-squared \(R^{2^{*}}\) according to the formulation of the model R-squared (that assesses the squared linear correlation between \(\boldsymbol {\hat {y}}\) and \(\boldsymbol {y}\)) in order to measure the squared linear correlation between \(\boldsymbol {y}^{*}\) and \(\boldsymbol {y}_{o}\):

$$R^{2^{*}} := \left(\frac{\sum_{i=1}^{n} \left(y_{i}^{*}-\bar{y}^{*}\right)\left(y_{o,i}-\bar{y}_{o}\right)}{\sqrt{\sum_{i=1}^{n} \left(y_{i}^{*}-\bar{y}^{*}\right)^{2}} \sqrt{\sum_{i=1}^{n} \left(y_{o,i}-\bar{y}_{o}\right)^{2}}}\right)^{2}. $$

\(\bar {y}^{*}\) and \(\bar {y}_{o}\) denote the arithmetic means of \(\boldsymbol {y}^{*}\) and \(\boldsymbol {y}_{o}\), respectively. Unlike the model R-squared, the predictive R-squared does not measure goodness-of-fit and cannot be interpreted as percentage of explained variance or deviance in the classical sense, since the decomposition of variance or deviance [10, 48] that holds for estimated values \(\boldsymbol {\hat {y}}\) does not hold for predicted values \(\boldsymbol {y}^{*}\). However, \(R^{2^{*}}\) is also bound to the interval \([0, 1]\), with values closer to one indicating a higher absolute predictive quality.
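A minimal sketch of the predictive R-squared, computed as the squared linear correlation between predicted and observed values as described above. The function name and the toy vectors are hypothetical.

```python
def predictive_r_squared(predicted, observed):
    """Squared linear (Pearson) correlation between predicted and observed values."""
    n = len(predicted)
    if n == 0 or n != len(observed):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cross = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    ss_o = sum((o - mean_o) ** 2 for o in observed)
    if ss_p == 0 or ss_o == 0:
        raise ValueError("correlation is undefined for a constant vector")
    # Squaring the correlation maps the range [-1, 1] onto [0, 1].
    return cross ** 2 / (ss_p * ss_o)


# Hypothetical toy data: the forecast is ten times too large for every member.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [10.0, 20.0, 30.0, 40.0]
print(predictive_r_squared(predicted, observed))  # 1.0
```

The toy forecast reaches the maximum value even though every prediction is ten times too large, which illustrates that this measure captures linear association rather than the size of the prediction error; it is therefore best read alongside MPSE and MPAE.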

  b) Measures for the sorting capacity of a forecast

Two measures directly characterizing the sorting capacity of a forecast are the Spearman rank correlation coefficient \(R_{\text {Sp}}\) and the area under the “matching curve” \(AUC_{m}\).

The Spearman or rank correlation coefficient \(R_{\text {Sp}}\) measures the monotone correlation between \(\boldsymbol {y}^{*}\) and \(\boldsymbol {y}_{o}\):

$$R_{\text{Sp}} = \frac{\sum_{i=1}^{n} \left(\text{rank}\left(y_{i}^{*}\right)-\overline{\text{rank}}(\boldsymbol{y}^{*})\right)(\text{rank}(y_{o,i})-\overline{\text{rank}}(\boldsymbol{y}_o))}{\sqrt{\sum_{i=1}^{n} \left(\text{rank}\left(y_{i}^{*}\right)-\overline{\text{rank}}(\boldsymbol{y}^{*})\right)^{2}} \sqrt{\sum_{i=1}^{n} \left(\text{rank}(y_{o,i})-\overline{\text{rank}}(\boldsymbol{y}_o)\right)^{2}}}. $$

where \(\overline {\text {rank}}(\boldsymbol {y})\) denotes the average rank of all elements of the vector \(\boldsymbol {y}\). \(R_{\text {Sp}}\) ranges between \(-1\) and \(+1\), where values close to \(+1\) indicate a high positive monotone correlation, meaning that predicted and observed claimed amounts are similarly ordered. For DMP selection, this is a desirable property.
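The rank correlation can be sketched as the linear correlation of the rank vectors. The text does not say how tied values are ranked; the sketch below assumes that tied values share their average rank, and all names and data are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ascending ranks; tied values share their average rank (assumption)."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in order[start:end + 1]:
            ranks[k] = shared_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Linear correlation coefficient of two equally long vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cross / math.sqrt(ss_x * ss_y)


def spearman(predicted, observed):
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    return pearson(average_ranks(predicted), average_ranks(observed))


# Hypothetical toy data: the forecast swaps the order of the second and third member.
observed = [100.0, 200.0, 300.0, 10000.0]
predicted = [50.0, 400.0, 300.0, 900.0]
print(round(spearman(predicted, observed), 6))  # 0.8
```

Because only ranks enter the calculation, the large underestimate for the high-cost member does not lower the coefficient as long as that member is still ranked first.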

The idea of the matching curve \(m\) is derived from the concept of the ROC (receiver operating characteristic) curve that is used to assess the predictive quality of binary regression models [42]. \(m(i)\) is defined as the percentage of those \(i\) members with the highest observed values who can also be found among the \(i\) members with the highest predicted values, where \(i\) runs from 1 to \(n\):

$$m(i) := \frac{1}{i} \sum\limits_{j=1}^i I\left(c_{o,(j)} \in \left\{c_{(1)}^*,\ldots,c_{(i)}^*\right\}\right), \quad i=1,\ldots,n.$$

In this definition, \(c_{o,(1)},\ldots ,c_{o,(n)}\) represents the vector of member codes sorted in descending order by the corresponding observed claims totals \(\boldsymbol {y}_{o}\). In parallel, \(c_{(1)}^{*},\ldots ,c_{(n)}^{*}\) denotes the vector of member codes sorted in descending order by the corresponding predicted claims totals \(\boldsymbol {y}^{*}\). \(I(\cdot )\) is an indicator function that is equal to 1 if the condition in brackets is fulfilled and 0 if it is not. Thus, \(m(i)\) indicates the percentage of matching member codes (matches) among the first i elements of the vectors \(c_{o,(1)},\ldots ,c_{o,(n)}\) and \(c_{(1)}^{*},\ldots ,c_{(n)}^{*}\). We obtain the area under the matching curve \(AUC_{m}\) by calculating \(\frac {1}{n} \sum _{i=1}^{n} m(i)\). The maximum \(AUC_{m}\) is 1, which occurs if the members have the same order in respect of observed and predicted values.
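The definition translates into a short sketch in which list positions stand in for member codes. The function names and toy data are hypothetical, and ties are broken by input order, which is an assumption the text does not specify.

```python
def matching_curve(predicted, observed):
    """Return [m(1), ..., m(n)]: share of the top-i observed members found in the top-i predicted."""
    n = len(predicted)
    if n == 0 or n != len(observed):
        raise ValueError("inputs must be non-empty and of equal length")
    # Member positions sorted in descending order of cost (stable sort, so ties keep input order).
    by_observed = sorted(range(n), key=lambda k: observed[k], reverse=True)
    by_predicted = sorted(range(n), key=lambda k: predicted[k], reverse=True)

    seen_observed, seen_predicted = set(), set()
    matches = 0
    curve = []
    for i in range(1, n + 1):
        o = by_observed[i - 1]
        p = by_predicted[i - 1]
        if o == p:
            matches += 1  # the same member enters both top-i lists at once
        else:
            if o in seen_predicted:
                matches += 1  # new observed member was already in the predicted top list
            if p in seen_observed:
                matches += 1  # new predicted member was already in the observed top list
        seen_observed.add(o)
        seen_predicted.add(p)
        curve.append(matches / i)
    return curve


def auc_m(predicted, observed):
    """Area under the matching curve: the mean of m(1), ..., m(n)."""
    curve = matching_curve(predicted, observed)
    return sum(curve) / len(curve)


# Hypothetical toy data: the forecast swaps the second and third most expensive member.
observed = [100.0, 200.0, 300.0, 10000.0]
predicted = [50.0, 400.0, 300.0, 900.0]
print(matching_curve(predicted, observed))  # [1.0, 0.5, 1.0, 1.0]
print(auc_m(predicted, observed))           # 0.875
```

The running count avoids rebuilding both top-\(i\) sets for every \(i\), which matters for portfolios with several thousand members.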

Figure 8 shows an example of a matching curve \(m(i)\) for our GLM (evaluated on the grid \(i = 50, 100, \ldots, 9{,}150\)) and the expected matching curve of a randomly ordered sample. In the context of DMP selection, it is particularly interesting to compare different values of the matching curve \(m(i)\) in the high-cost region in order to measure what percentage of the \(i\) members who actually have the highest claimed amounts can be identified by the prediction approach. This is why \(m(i)\) is also called the identification or hit ratio. Based on the identification ratio and some experience-driven assumptions on the average saving potential per cost group, a health insurer can easily compare the potential overall savings between different methods.

Fig. 8 Matching curve \(m\) of our GLM (solid line) with area under curve \(AUC_{m}\) (gray) and expected matching curve of a randomly ordered sample (dashed line)
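The random baseline can be written out under one assumption that the text does not state explicitly, namely that the random ordering is drawn uniformly from all permutations, independently of the observed costs. Each member then falls among the first \(i\) positions of the random ordering with probability \(i/n\), so that

$$\begin{array}{rll} \mathrm{E}[m(i)] &=& \frac{1}{i}\sum\limits_{j=1}^{i} P\left(c_{o,(j)} \in \left\{c_{(1)}^{*},\ldots,c_{(i)}^{*}\right\}\right) = \frac{1}{i}\cdot i \cdot \frac{i}{n} = \frac{i}{n}, \\ \mathrm{E}[AUC_{m}] &=& \frac{1}{n}\sum\limits_{i=1}^{n}\frac{i}{n} = \frac{n+1}{2n}. \end{array} $$

Under this assumption the expected matching curve is a straight line through the origin, and the expected area under it is approximately 0.5 for large \(n\), which gives a reference value for judging the \(AUC_{m}\) of a prediction approach.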


Bayerstadler, A., Benstetter, F., Heumann, C. et al. A predictive modeling approach to increasing the economic effectiveness of disease management programs. Health Care Manag Sci 17, 284–301 (2014). https://doi.org/10.1007/s10729-013-9246-y

  • DOI: https://doi.org/10.1007/s10729-013-9246-y
