Abstract
Predictive Modeling (PM) techniques are gaining importance in the worldwide health insurance business. Modern PM methods are used for customer relationship management, risk evaluation, and medical management. As an example of successful data-driven business management, this article illustrates a PM approach that allows the economic potential of (cost-)effective disease management programs (DMPs) to be fully exploited through optimized candidate selection. The approach is based on a Generalized Linear Model (GLM) that is easy for health insurance companies to apply. Using a small portfolio from an emerging country, we show that our GLM approach is stable compared to more sophisticated regression techniques in spite of the difficult data environment. Additionally, we demonstrate for this example setting that our model can compete with the expensive solutions offered by professional PM vendors and outperforms the non-predictive standard approaches for DMP selection commonly used in the market.
References
Akaike H (1974) A new look at the statistical model identification. IEEE Trans Autom Control 19(6):716–723
Antonio K, Beirlant J (2007) Actuarial statistics with generalized linear mixed models. Insur Math Econ 40(1):58–76
Belitz C, Brezger A, Kneib T, Lang S (2009) BayesX: software for Bayesian inference in structured additive regression models, version 2.0.1. Available at: http://www.stat.uni-muenchen.de/~bayesx
Billings J, Mijanovich T (2007) Improving the management of care for high-cost medicaid patients. Health Aff 26(6):1643–1655
Blough DK, Madden CW, Hornbrook MC (1999) Modeling risk using generalized linear models. J Health Econ 18:153–171
Bodenheimer T, Lorig K, Holman H, Grumbach K (2002) Patient self-management of chronic disease in primary care. J Am Med Assoc 288(19):2469–2475
Breiman L (1984) Classification and regression trees. Chapman & Hall/CRC, London
Buntin MB, Zaslavsky AM (2004) Too much ado about two-part models and transformation? Comparing methods of modeling Medicare expenditures. J Health Econ 23:525–542
Cameron AC, Trivedi PK (1998) Regression analysis of count data. Cambridge University Press, New York
Davison AC (2003) Statistical models. Cambridge University Press, New York
De Jong P, Heller GZ (2008) Generalized linear models for insurance data. Cambridge University Press, Cambridge
Diehr P, Yanez D, Ash A, Hornbrook M, Lin DY (1999) Methods for analyzing health care utilization and costs. Ann Rev Public Health 20:125–144
Duan N, Manning WG, Morris CN, Newhouse JP (1983) A comparison of alternative models for the demand for medical care. J Bus Econ Stat 1(2):115–126
Fahrmeir L, Kneib T (2010) Bayesian smoothing and regression for longitudinal, spatial and event history data. Oxford University Press, London
Francis L (2001) Neural networks demystified. Tech. rep. Casualty actuarial society forum. Available at: http://casualtyactuarialsociety.com/pubs/forum/01wforum/01wf253.pdf
Francis L (2003) Martian chronicles: is MARS better than neural networks? Tech. rep. Casualty actuarial society forum. Available at: http://casualtyactuarialsociety.com/pubs/forum/03wforum/03wf027.pdf
Freeman R, Lybecker KM, Taylor DW (2011) The effectiveness of disease management programs in the medicaid population. Tech. rep. The Cameron Institute. Available at: http://cameroninstitute.com
Frees EW, Valdez EA (2008) Hierarchical insurance claims modeling. J Am Stat Assoc 103(484):1457–1469
Frees EW, Young VR, Luo Y (1999) A longitudinal data analysis interpretation of credibility models. Insur Math Econ 24:229–247
Freitag AA (2002) Data mining and knowledge discovery with evolutionary algorithms. Springer Verlag, Berlin
Friedman JH (1991) Multivariate adaptive regression splines. Ann Stat 19(1):1–141
Good PI (2005) Introduction to statistics through resampling methods and R/S-PLUS. Wiley, New York
Haberman S, Renshaw AE (1998) Actuarial applications of generalized linear models. In: Hand D, Jacka S (eds) Statistics in finance. Arnold, E., London
IDF: International Diabetes Federation (2011) http://atlas.idf-bxl.org/content/economic-impacts-diabetes and http://www.idf.org/node/23640
Inglis SC, Clark RA, McAlister FA, Ball J, Lewinter C, Cullington D, Stewart S, Cleland JGF (2010) Structured telephone support or telemonitoring programmes for patients with chronic heart failure. Cochrane Database Syst Rev 2010 8:CD007228
Kolyshkina I, Wong SSW, Lim S (2004) Enhancing generalised linear models with data mining. Discussion paper. Casualty actuarial society. Arlington, Virginia. Available at: http://www.casact.org/pubs/dpp/dpp04/04dpp279.pdf
Lamers LM (1999) A risk-adjuster for capitation payments based on the use of prescribed drugs. Med Care 37:824–830
Kuha J (2004) AIC and BIC: comparisons of assumptions and performance. Sociol Methods Res 33(2):188–229
Liang KY, Zeger SL (1986) Longitudinal data analysis using generalized linear models. Biometrika 73(1):13–22
Lorig KR, Ritter P, Stewart AL, Sobel DS, Brown WB, Bandura A, Gonzalez VM, Laurent DD, Holman HR (2001) Chronic disease self-management program: 2-year health status and health care utilization outcomes. Med Care 39(11):1217–1223
MacKay D (2003) Information theory, inference and learning algorithms. Cambridge University Press, Cambridge
Manning WG (1998) The logged dependent variable, heteroscedasticity, and the retransformation problem. J Health Econ 17:283–295
Manning WG, Mullahy J (2001) Estimating log models: to transform or not to transform? J Health Econ 20:461–494
McCullagh P, Nelder JA (1989) Generalized linear models. Chapman & Hall / CRC, London
McCulloch CE, Searle SR (2001) Generalized, linear, and mixed models. Wiley, New York
Mehmud S, Winkelman R (2007) A comparative analysis of claims-based tools for health risk assessment. Tech. rep. Society of Actuaries. Available at: http://www.soa.org/files/pdf/risk-assessmentc.pdf
Meyer J, Smith BM (2008) Chronic disease management: evidence of predictable savings. Tech. rep. Health management associates. Available at: http://www.idph.state.ia.us/hcr_committees/common/pdf/clinicians/savings_report.pdf
Miller AJ (1990) Subset selection in regression. Chapman and Hall, New York
Mullahy J (1998) Much ado about two: reconsidering retransformation and the two-part model in health econometrics. J Health Econ 17:247–281
Newhouse JP, Manning WG, Keeler EB, Sloss EM (1989) Adjusting capitation rates using objective health measures and prior utilization. Health Care Financ R 10(3):41–54
Nugent R (2008) Chronic diseases in developing countries: health and economic burdens. Ann N Y Acad Sci 1136:70–79
Pearce J, Ferrier S (2000) Evaluating the predictive performance of habitat models developed using logistic regression. Ecol Model 133:225–245
Powers CA, Meyer CM, Roebuck MC, Vaziri B (2005) Predictive modeling of total healthcare costs using pharmacy claims data: a comparison of alternative econometric cost modeling techniques. Med Care 43(11):1065–1072
R Development Core Team (2009) R: a language and environment for statistical computing. R foundation for statistical computing. Vienna, Austria. http://www.R-project.org. ISBN 3-900051-07-0
Ripley BD (1996) Pattern recognition and neural networks. Cambridge University Press, Cambridge
Schwarz GE (1978) Estimating the dimension of a model. Ann Stat 6(2):461–464
Tutz G (2000) Analyse kategorialer Daten. Oldenbourg Verlag, Munich
Tutz G, Fahrmeir L (2001) Multivariate statistical modelling based on generalized linear models. Springer, New York
Veazie PJ, Manning WG, Kane RL (2003) Improving risk adjustment for medicare capitated reimbursement using nonlinear models. Med Care 41(6):741–752
Venables WN, Ripley BD (2002) Modern applied statistics with S, 4th edn. Springer, Berlin
Viaene S, Derrig RA, Baesens B, Dedene G (2002) A comparison of state-of-the-art classification techniques for expert automobile insurance claim fraud detection. J Risk Insur 69(3):373–421
Wedderburn RWM (1974) Quasi-likelihood functions, generalized linear models and the Gauss-Newton method. Biometrika 61(3):439–447
Yau KW, Lee AH, Ng ASK (2002) A zero-augmented gamma mixed model for longitudinal data with many zeros. Aust N Z J Stat 44(2):177–183
Acknowledgments
We would like to thank the health insurance company concerned for providing us with claims data and the three PM vendors for participating in the test.
Appendix: Definition and interpretation of different predictive measures
In this appendix, we introduce two kinds of predictive measures that permit a comparison of cost prediction techniques with regard to an efficient DMP selection:
a) Measures for the accuracy of a forecast that indirectly assess how many of the patients with the highest saving potential can be identified
b) Measures for the sorting capacity of a forecast that directly examine the same question
The latter group of measures might be more suitable for analyzing the direct cost benefit in the actual data situation described. For comparing the general ability of an approach to optimize the selection of DMP participants by forecasting claimed amounts, the measures in group a) are equally important.
a) Measures for the accuracy of a forecast
Two predictive measures quantifying the prediction error are the mean predictive squared error (MPSE) and the mean predictive absolute error (MPAE), defined as
\[\text{MPSE} = \frac{1}{n} \sum_{i=1}^{n} \left(y_{i}^{*} - y_{o,i}\right)^{2} \quad \text{and} \quad \text{MPAE} = \frac{1}{n} \sum_{i=1}^{n} \left|y_{i}^{*} - y_{o,i}\right|.\]
Both measures analyze the differences between predicted costs \(\boldsymbol{y}^{*} = (y_{1}^{*},\ldots,y_{n}^{*})'\) and actually observed costs \(\boldsymbol{y}_{o} = (y_{o,1},\ldots,y_{o,n})'\). The MPSE is based on the quadratic loss function, so predictions that avoid extreme discrepancies between \(\boldsymbol{y}^{*}\) and \(\boldsymbol{y}_{o}\) are rated best. By contrast, the MPAE, which is based on the absolute loss function, favors predictions that are good “on average”. Because squared errors are dominated by the largest deviations, which typically arise for high-cost members, the MPSE rewards precise predictions for exactly these members. This is very relevant for DMP selection: failing to recognize a member who will, without preventive interaction, produce exploding medical costs in the near future means that the individual saving potential related to a DMP cannot be realized.
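To make the contrast between the two loss functions concrete, the following Python sketch computes both measures for two made-up forecasts. The function names `mpse` and `mpae` and all cost figures are illustrative assumptions, not code or data from the study.

```python
def _check(predicted, observed):
    # Both vectors must describe the same, non-empty set of members.
    if len(predicted) != len(observed) or len(observed) == 0:
        raise ValueError("predicted and observed must be non-empty and of equal length")


def mpse(predicted, observed):
    """Mean predictive squared error: average squared prediction error."""
    _check(predicted, observed)
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)


def mpae(predicted, observed):
    """Mean predictive absolute error: average absolute prediction error."""
    _check(predicted, observed)
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


# Hypothetical annual costs for four members; the last one is a high-cost case.
observed = [100.0, 200.0, 300.0, 10000.0]
# Forecast A is exact for three members but misses the high-cost member by 4000.
forecast_a = [100.0, 200.0, 300.0, 6000.0]
# Forecast B is off by 1000 for every member.
forecast_b = [1100.0, 1200.0, 1300.0, 9000.0]

for name, forecast in (("A", forecast_a), ("B", forecast_b)):
    print(name, "MPAE:", mpae(forecast, observed), "MPSE:", mpse(forecast, observed))
```

In this toy example both forecasts have the same MPAE, but forecast A has four times the MPSE of forecast B because its entire error is concentrated on the high-cost member.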
A disadvantage of both MPSE and MPAE is that these measures are not normed, unlike, for example, the coefficient of determination or model R-squared, which measures goodness-of-fit in linear models and ranges between 0 and 1. Hence, it is desirable to define a normed measure that is bounded to a limited interval of possible values and measures the absolute predictive quality of a model. For this purpose, we define the so-called predictive R-squared \(R^{2^{*}}\) in analogy to the model R-squared (which assesses the squared linear correlation between \(\boldsymbol {\hat {y}}\) and \(\boldsymbol {y}\)) in order to measure the squared linear correlation between \(\boldsymbol {y}^{*}\) and \(\boldsymbol {y}_{o}\):
\[R^{2^{*}} = \frac{\left(\sum_{i=1}^{n} \left(y_{i}^{*} - \bar{y}^{*}\right)\left(y_{o,i} - \bar{y}_{o}\right)\right)^{2}}{\sum_{i=1}^{n} \left(y_{i}^{*} - \bar{y}^{*}\right)^{2} \, \sum_{i=1}^{n} \left(y_{o,i} - \bar{y}_{o}\right)^{2}}\]
Here, \(\bar {y}^{*}\) and \(\bar {y}_{o}\) denote the arithmetic means of \(\boldsymbol {y}^{*}\) and \(\boldsymbol {y}_{o}\), respectively. Unlike the model R-squared, the predictive R-squared does not measure goodness-of-fit and cannot be interpreted as the percentage of explained variance or deviance in the classical sense, since the decomposition of variance or deviance [10, 48] that holds for estimated values \(\boldsymbol {\hat {y}}\) does not hold for predicted values \(\boldsymbol {y}^{*}\). However, \(R^{2^{*}}\) is also bounded to the interval \([0;1]\), with values closer to one indicating a higher absolute predictive quality.
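As a minimal sketch, the predictive R-squared can be computed as a squared Pearson correlation between predicted and observed costs. The function name `predictive_r_squared` and the example numbers are assumptions made for illustration only.

```python
def predictive_r_squared(predicted, observed):
    """Squared linear correlation between predicted and observed costs."""
    n = len(observed)
    if len(predicted) != n or n < 2:
        raise ValueError("need two vectors of equal length with at least two members")
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    # Sums of cross products and of squared deviations from the means.
    cross = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    ss_o = sum((o - mean_o) ** 2 for o in observed)
    if ss_p == 0 or ss_o == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cross ** 2 / (ss_p * ss_o)


# Hypothetical observed costs and a forecast that is biased but perfectly linear in them.
observed = [100.0, 250.0, 400.0, 900.0, 5000.0]
biased_forecast = [2.0 * o + 50.0 for o in observed]

print(predictive_r_squared(biased_forecast, observed))
```

Because a correlation is unaffected by shifting or rescaling one of the vectors, the systematically biased forecast above still reaches a value of (numerically) one. This is one reason to read \(R^{2^{*}}\) alongside MPSE and MPAE rather than in isolation.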
b) Measures for the sorting capacity of a forecast
Two measures directly characterizing the sorting capacity of a forecast are the Spearman rank correlation coefficient \(R_{\text {Sp}}\) and the area under the “matching curve” \(AUC_{m}\).
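The Spearman or rank correlation coefficient \(R_{\text {Sp}}\) measures the monotone correlation between \(\boldsymbol {y}^{*}\) and \(\boldsymbol {y}_{o}\):
\[R_{\text{Sp}} = \frac{\sum_{i=1}^{n} \left(\text{rank}(y_{i}^{*}) - \overline{\text{rank}}(\boldsymbol{y}^{*})\right)\left(\text{rank}(y_{o,i}) - \overline{\text{rank}}(\boldsymbol{y}_{o})\right)}{\sqrt{\sum_{i=1}^{n} \left(\text{rank}(y_{i}^{*}) - \overline{\text{rank}}(\boldsymbol{y}^{*})\right)^{2} \, \sum_{i=1}^{n} \left(\text{rank}(y_{o,i}) - \overline{\text{rank}}(\boldsymbol{y}_{o})\right)^{2}}}\]
where \(\overline {\text {rank}}(\boldsymbol {y})\) denotes the average rank of all elements of the vector \(\boldsymbol {y}\). \(R_{\text {Sp}}\) ranges between \(-1\) and \(+1\), where values close to \(+1\) indicate a high positive monotone correlation, meaning that predicted and observed claimed amounts are similarly ordered. For DMP selection, this is a desirable property.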
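The rank correlation can be sketched in a few lines of Python as a Pearson correlation applied to ranks. Assigning tied values their average rank is an assumption of this sketch (the text does not state how ties are handled), and the names `average_ranks` and `spearman` are illustrative.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks in ascending order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block [pos, end] over all entries tied with values[order[pos]].
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared_rank
        pos = end + 1
    return ranks


def spearman(predicted, observed):
    """Spearman rank correlation between predicted and observed costs."""
    if len(predicted) != len(observed) or len(observed) < 2:
        raise ValueError("need two vectors of equal length with at least two members")
    rp, ro = average_ranks(predicted), average_ranks(observed)
    n = len(rp)
    mean_p, mean_o = sum(rp) / n, sum(ro) / n
    cross = sum((a - mean_p) * (b - mean_o) for a, b in zip(rp, ro))
    ss_p = sum((a - mean_p) ** 2 for a in rp)
    ss_o = sum((b - mean_o) ** 2 for b in ro)
    if ss_p == 0 or ss_o == 0:
        raise ValueError("rank correlation is undefined for a constant vector")
    return cross / sqrt(ss_p * ss_o)


# Hypothetical costs: the forecast is far off in level but orders members correctly.
observed = [10.0, 20.0, 30.0, 40.0]
forecast = [1.0, 4.0, 9.0, 16.0]
print(spearman(forecast, observed))
```

The example forecast is poor in absolute terms but ranks all members correctly, so its rank correlation is \(+1\); reversing the forecast would give \(-1\).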
The idea of the matching curve m is derived from the concept of the ROC (receiver operating characteristic) curve that is used to assess the predictive quality of binary regression models [42]. \(m(i)\) is defined as the percentage of those i members with the highest observed values who can also be found among the i members with the highest predicted values, where i runs from 1 to n:
\[m(i) = \frac{1}{i} \sum_{j=1}^{i} I\left(c_{o,(j)} \in \left\{c_{(1)}^{*},\ldots,c_{(i)}^{*}\right\}\right), \quad i = 1,\ldots,n\]
In this definition, \(c_{o,(1)},\ldots ,c_{o,(n)}\) represents the vector of member codes sorted in descending order by the corresponding observed claims totals \(\boldsymbol {y}_{o}\). In parallel, \(c_{(1)}^{*},\ldots ,c_{(n)}^{*}\) denotes the vector of member codes sorted in descending order by the corresponding predicted claims totals \(\boldsymbol {y}^{*}\). \(I(\cdot )\) is an indicator function that is equal to 1 if the condition in brackets is fulfilled and 0 if it is not. Thus, \(m(i)\) indicates the percentage of matching member codes (matches) among the first i elements of the vectors \(c_{o,(1)},\ldots ,c_{o,(n)}\) and \(c_{(1)}^{*},\ldots ,c_{(n)}^{*}\). We obtain the area under the matching curve \(AUC_{m}\) by calculating \(\frac {1}{n} \sum _{i=1}^{n} m(i)\). The maximum \(AUC_{m}\) is 1, which occurs if the members have the same order with respect to observed and predicted values.
Figure 8 shows an example of a matching curve \(m(i)\) for our GLM (evaluated on the grid \(i = 50, 100, \ldots, 9{,}150\)) and the expected matching curve of a randomly ordered sample. In the context of DMP selection, it is particularly interesting to compare different values of the matching curve \(m(i)\) in the high-cost region in order to measure what percentage of the i members with the highest actual claimed amounts can be identified by the prediction approach. This is why \(m(i)\) is also called the identification or hit ratio. Based on the identification ratio and some experience-driven assumptions on the average saving potential per cost group, a health insurer can easily compare the potential overall savings between different methods.
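For the random benchmark, note that under a uniformly random ordering each of the i members with the highest observed costs falls among the i selected members with probability \(i/n\), so the expected identification ratio is \(i/n\). The Python sketch below compares the hit ratio of a forecast at a single cutoff with this baseline. The function names, the cutoff, and the cost figures are hypothetical and serve only as an illustration.

```python
def hit_ratio(predicted, observed, i):
    """Identification ratio m(i) at a single cutoff i."""
    n = len(observed)
    if len(predicted) != n:
        raise ValueError("predicted and observed must have equal length")
    if not 1 <= i <= n:
        raise ValueError("cutoff i must lie between 1 and n")
    # Indices of the i members with the highest observed and predicted costs.
    top_observed = set(sorted(range(n), key=lambda k: observed[k], reverse=True)[:i])
    top_predicted = set(sorted(range(n), key=lambda k: predicted[k], reverse=True)[:i])
    return len(top_observed & top_predicted) / i


def random_hit_ratio(i, n):
    """Expected identification ratio when the i candidates are chosen at random."""
    if not 1 <= i <= n:
        raise ValueError("cutoff i must lie between 1 and n")
    return i / n


# Hypothetical portfolio of eight members; select the two highest predicted.
observed = [500.0, 100.0, 900.0, 300.0, 50.0, 700.0, 20.0, 10.0]
forecast = [450.0, 400.0, 800.0, 50.0, 60.0, 90.0, 30.0, 20.0]
cutoff = 2

model = hit_ratio(forecast, observed, cutoff)
baseline = random_hit_ratio(cutoff, len(observed))
print("model:", model, "random:", baseline, "lift:", model / baseline)
```

Multiplying such hit ratios by an assumed average saving per correctly identified member would give the kind of rough savings comparison between methods described above; any such saving figure is an input the insurer has to supply.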
Cite this article
Bayerstadler, A., Benstetter, F., Heumann, C. et al. A predictive modeling approach to increasing the economic effectiveness of disease management programs. Health Care Manag Sci 17, 284–301 (2014). https://doi.org/10.1007/s10729-013-9246-y
DOI: https://doi.org/10.1007/s10729-013-9246-y