A predictive modeling approach to increasing the economic effectiveness of disease management programs

Bayerstadler, Andreas; Benstetter, Franz; Heumann, Christian; Winter, Fabian

doi:10.1007/s10729-013-9246-y

Published: 19 June 2013

Volume 17, pages 284–301, (2014)

Health Care Management Science

Andreas Bayerstadler¹,
Franz Benstetter¹,
Christian Heumann² &
…
Fabian Winter¹

Abstract

Predictive Modeling (PM) techniques are gaining importance in the worldwide health insurance business. Modern PM methods are used for customer relationship management, risk evaluation, or medical management. This article illustrates a PM approach that enables the economic potential of (cost-)effective disease management programs (DMPs) to be fully exploited by optimized candidate selection as an example of successful data-driven business management. The approach is based on a Generalized Linear Model (GLM) that is easy for health insurance companies to apply. Using a small portfolio from an emerging country, we show that our GLM approach is stable compared to more sophisticated regression techniques in spite of the difficult data environment. Additionally, we demonstrate for this example setting that our model can compete with the expensive solutions offered by professional PM vendors and outperforms the non-predictive standard approaches for DMP selection commonly used in the market.

References

Akaike H (1974) A new look at the statistical model identification. IEEE Trans Autom Control 19(6):716–723
Antonio K, Beirlant J (2007) Actuarial statistics with generalized linear mixed models. Insur Math Econ 40(1):58–76
Belitz C, Brezger A, Kneib T, Lang S (2009) BayesX—software for Bayesian inference in structured additive regression models, version 2.0.1. Available at: http://www.stat.uni-muenchen.de/~bayesx
Billings J, Mijanovich T (2007) Improving the management of care for high-cost medicaid patients. Health Aff 26(6):1643–1655
Blough DK, Madden CW, Hornbrook MC (1999) Modeling risk using generalized linear models. J Health Econ 18:153–171
Bodenheimer T, Lorig K, Holman H, Grumbach K (2002) Patient self-management of chronic disease in primary care. J Am Med Assoc 288(19):2469–2475
Breiman L (1984) Classification and regression trees. Chapman & Hall/CRC, London
Buntin MB, Zaslavsky AM (2004) Too much ado about two-part models and transformation? Comparing methods of modeling Medicare expenditures. J Health Econ 23:525–542
Cameron AC, Trivedi PK (1998) Regression analysis of count data. Cambridge University Press, New York
Davison AC (2003) Statistical models. Cambridge University Press, New York
De Jong P, Heller GZ (2008) Generalized linear models for insurance data. Cambridge University Press, Cambridge
Diehr P, Yanez D, Ash A, Hornbrook M, Lin DY (1999) Methods for analyzing health care utilization and costs. Ann Rev Public Health 20:125–144
Duan N, Manning WG, Morris CN, Newhouse JP (1983) A comparison of alternative models for the demand for medical care. J Bus Econ Stat 1(2):115–126
Fahrmeir L, Kneib T (2010) Bayesian smoothing and regression for longitudinal, spatial and event history data. Oxford University Press, London
Francis L (2001) Neural networks demystified. Tech. rep. Casualty actuarial society forum. Available at: http://casualtyactuarialsociety.com/pubs/forum/01wforum/01wf253.pdf
Francis L (2003) Martian chronicles: is MARS better than neural networks? Tech. rep. Casualty actuarial society forum. Available at: http://casualtyactuarialsociety.com/pubs/forum/03wforum/03wf027.pdf
Freeman R, Lybecker KM, Taylor DW (2011) The effectiveness of disease management programs in the medicaid population. Tech. rep. The Cameron Institute. Available at: http://cameroninstitute.com
Frees EW, Valdez EA (2008) Hierarchical insurance claims modeling. J Am Stat Assoc 103(484):1457–1469
Frees EW, Young VR, Luo Y (1999) A longitudinal data analysis interpretation of credibility models. Insur Math Econ 24:229–247
Freitas AA (2002) Data mining and knowledge discovery with evolutionary algorithms. Springer Verlag, Berlin
Friedman JH (1991) Multivariate adaptive regression splines. Ann Stat 19(1):1–141
Good PI (2005) Introduction to statistics through resampling methods and R/S-PLUS. Wiley, New York
Haberman S, Renshaw AE (1998) Actuarial applications of generalized linear models. In: Hand D, Jacka S (eds) Statistics in finance. Arnold, E., London
IDF: International Diabetes Federation (2011) http://atlas.idf-bxl.org/content/economic-impacts-diabetes and http://www.idf.org/node/23640
Inglis SC, Clark RA, McAlister FA, Ball J, Lewinter C, Cullington D, Stewart S, Cleland JGF (2010) Structured telephone support or telemonitoring programmes for patients with chronic heart failure. Cochrane Database Syst Rev 2010 8:CD007228
Kolyshkina I, Wong SSW, Lim S (2004) Enhancing generalised linear models with data mining. Discussion paper. Casualty actuarial society. Arlington, Virginia. Available at: http://www.casact.org/pubs/dpp/dpp04/04dpp279.pdf
Lamers LM (1999) A risk-adjuster for capitation payments based on the use of prescribed drugs. Med Care 37:824–830
Kuha J (2004) AIC and BIC—comparisons of assumptions and performance. Sociol Methods Res 33(2):188–229
Liang KY, Zeger S (1986) Longitudinal data analysis using generalized linear models. Biometrika 73(1):13–22
Lorig KR, Ritter P, Stewart AL, Sobel DS, Brown WB, Bandura A, Gonzalez VM, Laurent DD, Holman HR (2001) Chronic disease self-management program: 2-year health status and health care utilization outcomes. Med Care 39(11):1217–1223
MacKay D (2003) Information theory, inference and learning algorithms. Cambridge University Press, Cambridge
Manning WG (1998) The logged dependent variable, heteroscedasticity, and the retransformation problem. J Health Econ 17:283–295
Manning WG, Mullahy J (2001) Estimating log models: to transform or not to transform? J Health Econ 20:461–494
McCullagh P, Nelder JA (1989) Generalized linear models. Chapman & Hall / CRC, London
McCulloch CE, Searle SR (2001) Generalized, linear, and mixed models. Wiley, New York
Mehmud S, Winkelman R (2007) A comparative analysis of claims-based tools for health risk assessment. Tech. rep. Society of Actuaries. Available at: http://www.soa.org/files/pdf/risk-assessmentc.pdf
Meyer J, Smith BM (2008) Chronic disease management: evidence of predictable savings. Tech. rep. Health management associates. Available at: http://www.idph.state.ia.us/hcr_committees/common/pdf/clinicians/savings_report.pdf
Miller AJ (1990) Subset selection in regression. Chapman and Hall, New York
Mullahy J (1998) Much ado about two: reconsidering retransformation and the two-part model in health econometrics. J Health Econ 17:247–281
Newhouse JP, Manning WG, Keeler EB, Sloss EM (1989) Adjusting capitation rates using objective health measures and prior utilization. Health Care Financ R 10(3):41–54
Nugent R (2008) Chronic diseases in developing countries: health and economic burdens. Ann N Y Acad Sci 1136:70–79
Pearce J, Ferrier S (2000) Evaluating the predictive performance of habitat models developed using logistic regression. Ecol Model 133:225–245
Powers CA, Meyer CM, Roebuck MC, Vaziri B (2005) Predictive modeling of total healthcare costs using pharmacy claims data: a comparison of alternative econometric cost modeling techniques. Med Care 43(11):1065–1072
R Development Core Team (2009) R: a language and environment for statistical computing. R foundation for statistical computing. Vienna, Austria. http://www.R-project.org. ISBN 3-900051-07-0
Ripley BD (1996) Pattern recognition and neural networks. Cambridge University Press, Cambridge
Schwarz GE (1978) Estimating the dimension of a model. Ann Stat 6(2):461–464
Tutz G (2000) Analyse kategorialer Daten. Oldenbourg Verlag, Munich
Tutz G, Fahrmeir L (2001) Multivariate statistical modelling based on generalized linear models. Springer, New York
Veazie PJ, Manning WG, Kane RL (2003) Improving risk adjustment for medicare capitated reimbursement using nonlinear models. Med Care 41(6):741–752
Venables WN, Ripley BD (2002) Modern applied statistics with S, 4th edn. Springer, Berlin
Viaene S, Derrig RA, Baesens B, Dedene G (2002) A comparison of state-of-the-art classification techniques for expert automobile insurance claim fraud detection. J Risk Insur 69(3):373–421
Wedderburn RWM (1974) Quasi-likelihood functions, generalized linear models and the Gauss-Newton method. Biometrika 61(3):439–447
Yau KW, Lee AH, Ng ASK (2002) A zero-augmented gamma mixed model for longitudinal data with many zeros. Aust N Z J Stat 44(2):177–183

Acknowledgments

We would like to thank the health insurance company concerned for providing us with claims data and the three PM vendors for participating in the test.

Author information

Authors and Affiliations

Munich Health, Munich Re, Königinstraße 107, 80802, Munich, Germany
Andreas Bayerstadler, Franz Benstetter & Fabian Winter
Institute of Statistics, Ludwig-Maximilians-Universität München, Ludwigstraße 33, 80539, Munich, Germany
Christian Heumann

Corresponding author

Correspondence to Andreas Bayerstadler.

Appendix: Definition and interpretation of different predictive measures

In this appendix, we introduce two kinds of predictive measures that permit a comparison of cost prediction techniques with regard to an efficient DMP selection:

a) Measures for the accuracy of a forecast that indirectly assess how many of the patients with the highest saving potential can be identified
b) Measures for the sorting capacity of a forecast that directly examine the same question

The latter group of measures might be more suitable for analyzing the direct cost benefit in the actual data situation described. For comparing the general ability of an approach to optimize the selection of DMP participants by forecasting claimed amounts, the measures in group a) are equally important.

a) Measures for the accuracy of a forecast

Two predictive measures quantifying the prediction error are the mean predictive squared error (MPSE) and the mean predictive absolute error (MPAE) defined as

$$\begin{array}{rll} \text{MPSE} &:=& \frac{1}{n}\sum\limits_{i=1}^{n} \left(y_{i}^{*}-y_{o,i}\right)^{2} \quad \text{and} \\ \text{MPAE} &:=& \frac{1}{n}\sum\limits_{i=1}^{n} \left|y_{i}^{*}-y_{o,i}\right|. \end{array} $$

Both measures analyze the differences between predicted ($\boldsymbol{y}^{*} = (y_{1}^{*},\ldots,y_{n}^{*})'$) and actually observed ($\boldsymbol{y}_{o} = (y_{o,1},\ldots,y_{o,n})'$) costs. The MPSE is based on the quadratic loss function. This means that predictions that avoid extreme discrepancies between $\boldsymbol{y}^{*}$ and $\boldsymbol{y}_{o}$ are rated best. By contrast, the MPAE, which is based on the absolute loss function, favors predictions that are good “on average”. Because it penalizes large errors disproportionately, the MPSE rewards precise predictions for high-cost members. This is very relevant for DMP selection, because failing to recognize a member who will, without preventive interaction, produce exploding medical costs in the near future means that the individual saving potential related to a DMP cannot be realized.
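The contrast between the two loss functions can be illustrated with a minimal sketch. The function names and the toy cost figures below are hypothetical and chosen only for illustration; they are not taken from the portfolio analyzed in the article.

```python
def mpse(predicted, observed):
    """Mean predictive squared error: average squared difference."""
    n = len(observed)
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n


def mpae(predicted, observed):
    """Mean predictive absolute error: average absolute difference."""
    n = len(observed)
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / n


# Hypothetical observed costs: three low-cost members and one high-cost member.
observed = [100.0, 200.0, 300.0, 10000.0]
# Forecast A is exact for the low-cost members but misses the high-cost one by 4000.
pred_a = [100.0, 200.0, 300.0, 6000.0]
# Forecast B is off by 1000 for every member.
pred_b = [1100.0, 1200.0, 1300.0, 9000.0]

for name, pred in (("A", pred_a), ("B", pred_b)):
    print(name, "MPSE:", mpse(pred, observed), "MPAE:", mpae(pred, observed))
```

In this toy example both forecasts have the same MPAE, while the MPSE rates forecast B better because it avoids the single extreme discrepancy on the high-cost member.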

A disadvantage of both MPSE and MPAE is that these measures are not normalized, unlike, for example, the coefficient of determination or model R-squared, which measures goodness-of-fit in linear models and ranges between 0 and 1. Hence, it is desirable to define a normalized measure that is bounded to a limited interval of possible values and measures the absolute predictive quality of a model. For this purpose, we define the so-called predictive R-squared $R^{2^{*}}$ in analogy to the model R-squared (which assesses the squared linear correlation between $\boldsymbol {\hat {y}}$ and $\boldsymbol {y}$) in order to measure the squared linear correlation between $\boldsymbol {y}^{*}$ and $\boldsymbol {y}_{o}$:

$$R^{2^{*}} := \left(\frac{\sum_{i=1}^{n} \left(y_{i}^{*}-\bar{y}^{*}\right)\left(y_{o,i}-\bar{y}_{o}\right)}{\sqrt{\sum_{i=1}^{n} \left(y_{i}^{*}-\bar{y}^{*}\right)^{2}} \sqrt{\sum_{i=1}^{n} \left(y_{o,i}-\bar{y}_{o}\right)^{2}}}\right)^{2}. $$

$\bar {y}^{*}$ and $\bar {y}_{o}$ denote the arithmetic means of $\boldsymbol {y}^{*}$ and $\boldsymbol {y}_{o}$, respectively. Unlike the model R-squared, the predictive R-squared does not measure goodness-of-fit and cannot be interpreted as the percentage of explained variance or deviance in the classical sense, since the decomposition of variance or deviance [10, 48] that holds for estimated values $\boldsymbol {\hat {y}}$ does not hold for predicted values $\boldsymbol {y}^{*}$. However, $R^{2^{*}}$ is also bounded to the interval $[0, 1]$, with values closer to one indicating a higher absolute predictive quality.
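As a minimal sketch, the predictive R-squared can be computed as the squared linear correlation between predicted and observed costs, as described in the text. The function name and the toy vectors are hypothetical, and the sketch assumes that neither vector is constant (otherwise the denominator is zero).

```python
def predictive_r_squared(predicted, observed):
    """Squared linear (Pearson) correlation between predictions and observations.

    Assumes both inputs have equal length and are not constant.
    """
    n = len(observed)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    # Cross-product of the deviations from the respective means.
    cross = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    ss_o = sum((o - mean_o) ** 2 for o in observed)
    return cross ** 2 / (ss_p * ss_o)


# Hypothetical toy data: predictions are ten times too high but perfectly linear.
observed = [1.0, 2.0, 3.0, 4.0]
scaled = [10.0, 20.0, 30.0, 40.0]
print(predictive_r_squared(scaled, observed))
```

Two properties are worth keeping in mind when reading this measure: because it is a squared correlation, it is insensitive to a systematic scaling or shift of the predictions, and it does not distinguish a positive from a negative linear relationship. It is therefore probably best read alongside MPSE and MPAE rather than on its own.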

b) Measures for the sorting capacity of a forecast

Two measures directly characterizing the sorting capacity of a forecast are the Spearman rank correlation coefficient $R_{\text {Sp}}$ and the area under the “matching curve” $AUC_{m}$.

The Spearman or rank correlation coefficient $R_{\text {Sp}}$ measures the monotone correlation between $\boldsymbol {y}^{*}$ and $\boldsymbol {y}_{o}$:

$$R_{\text{Sp}} = \frac{\sum_{i=1}^{n} \left(\text{rank}\left(y_{i}^{*}\right)-\overline{\text{rank}}(\boldsymbol{y}^{*})\right)\left(\text{rank}(y_{o,i})-\overline{\text{rank}}(\boldsymbol{y}_o)\right)}{\sqrt{\sum_{i=1}^{n} \left(\text{rank}\left(y_{i}^{*}\right)-\overline{\text{rank}}(\boldsymbol{y}^{*})\right)^{2}} \sqrt{\sum_{i=1}^{n} \left(\text{rank}(y_{o,i})-\overline{\text{rank}}(\boldsymbol{y}_o)\right)^{2}}}, $$

where $\overline {\text {rank}}(\boldsymbol {y})$ denotes the average rank of all elements of the vector $\boldsymbol {y}$. $R_{\text {Sp}}$ ranges between $-1$ and $+1$, where values close to $+1$ indicate a high positive monotone correlation, meaning that predicted and observed claimed amounts are similarly ordered. For DMP selection, this is a desirable property.
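The rank correlation above can be sketched in a few lines. The text does not say how tied values are ranked; this sketch assumes the common convention of assigning tied values the mean of their ranks. The function names and toy data are hypothetical.

```python
import math


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions (assumption)."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions pos+1 .. end+1
        for k in order[pos:end + 1]:
            ranks[k] = shared
        pos = end + 1
    return ranks


def spearman(predicted, observed):
    """Pearson correlation of the rank vectors, as in the formula for R_Sp."""
    rp, ro = average_ranks(predicted), average_ranks(observed)
    n = len(rp)
    mean_p, mean_o = sum(rp) / n, sum(ro) / n
    num = sum((a - mean_p) * (b - mean_o) for a, b in zip(rp, ro))
    den = math.sqrt(sum((a - mean_p) ** 2 for a in rp)) * math.sqrt(
        sum((b - mean_o) ** 2 for b in ro)
    )
    return num / den


# Hypothetical toy data: the forecast is biased but orders the members correctly.
observed = [100.0, 200.0, 300.0, 10000.0]
predicted = [1100.0, 1200.0, 1300.0, 9000.0]
print(spearman(predicted, observed))
```

Because only the ordering enters the calculation, a forecast with a large systematic bias can still reach a rank correlation close to $+1$, which is the property that matters for ranking DMP candidates.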

The idea of the matching curve $m$ is derived from the concept of the ROC (receiver operating characteristic) curve that is used to assess the predictive quality of binary regression models [42]. $m(i)$ is defined as the percentage of those $i$ members with the highest observed values who can also be found among the $i$ members with the highest predicted values, where $i$ runs from 1 to $n$:

$$m(i) := \frac{1}{i} \sum\limits_{j=1}^{i} I\left(c_{o,(j)} \in \left\{c_{(1)}^{*},\ldots,c_{(i)}^{*}\right\}\right), \quad i=1,\ldots,n.$$

In this definition, $c_{o,(1)},\ldots ,c_{o,(n)}$ represents the vector of member codes sorted in descending order by the corresponding observed claims totals $\boldsymbol {y}_{o}$. Analogously, $c_{(1)}^{*},\ldots ,c_{(n)}^{*}$ denotes the vector of member codes sorted in descending order by the corresponding predicted claims totals $\boldsymbol {y}^{*}$. $I(\cdot )$ is an indicator function that is equal to 1 if the condition in brackets is fulfilled and 0 if it is not. Thus, $m(i)$ indicates the percentage of matching member codes (matches) among the first $i$ elements of the vectors $c_{o,(1)},\ldots ,c_{o,(n)}$ and $c_{(1)}^{*},\ldots ,c_{(n)}^{*}$. We obtain the area under the matching curve $AUC_{m}$ by calculating $\frac {1}{n} \sum _{i=1}^{n} m(i)$. The maximum $AUC_{m}$ is 1, which occurs if the members have the same order with respect to observed and predicted values.
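The definition of $m(i)$ and $AUC_{m}$ can be sketched as follows. In this illustration the list positions stand in for the member codes, ties are broken by original position, and the function names and toy data are hypothetical.

```python
def matching_curve(predicted, observed):
    """Return [m(1), ..., m(n)]: share of the top-i observed members
    that also appear among the top-i predicted members."""
    n = len(observed)
    # Member codes (here: list positions) sorted in descending order of cost.
    by_observed = sorted(range(n), key=lambda k: observed[k], reverse=True)
    by_predicted = sorted(range(n), key=lambda k: predicted[k], reverse=True)
    top_observed, top_predicted = set(), set()
    curve = []
    for i in range(1, n + 1):
        top_observed.add(by_observed[i - 1])
        top_predicted.add(by_predicted[i - 1])
        curve.append(len(top_observed & top_predicted) / i)
    return curve


def auc_matching(predicted, observed):
    """Area under the matching curve: the mean of m(i) over i = 1, ..., n."""
    curve = matching_curve(predicted, observed)
    return sum(curve) / len(curve)


# Hypothetical toy data for four members.
observed = [500.0, 100.0, 900.0, 300.0]
predicted = [400.0, 350.0, 800.0, 200.0]
print(matching_curve(predicted, observed))
print(auc_matching(predicted, observed))
```

In the toy data the forecast identifies the two most expensive members correctly but swaps the two cheapest, so the curve dips at $i = 3$ and returns to 1 at $i = n$, where both sets necessarily contain all members.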

Figure 8 shows an example of a matching curve $m(i)$ for our GLM (evaluated on the grid $i = 50, 100, \ldots, 9{,}150$) and the expected matching curve of a randomly ordered sample. In the context of DMP selection, it is particularly interesting to compare different values of the matching curve $m(i)$ in the high-cost region in order to measure what percentage of the $i$ members who actually have the highest claimed amounts can be identified by the prediction approach. This is why $m(i)$ is also called the identification or hit ratio. Based on the identification ratio and some experience-driven assumptions on the average saving potential per cost group, a health insurer can easily compare the potential overall savings between different methods.
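The expected matching curve of a randomly ordered sample can be worked out in closed form. The following is a sketch under assumptions not stated in the text: the predicted order is a uniformly random permutation of the members, independent of the observed costs, and there are no ties. Each of the $i$ members with the highest observed values then falls among the $i$ highest predicted values with probability $i/n$, so that

$$\mathrm{E}\left[m(i)\right] = \frac{1}{i}\sum\limits_{j=1}^{i} \frac{i}{n} = \frac{i}{n}, \qquad \mathrm{E}\left[AUC_{m}\right] = \frac{1}{n}\sum\limits_{i=1}^{n} \frac{i}{n} = \frac{n+1}{2n}.$$

Under these assumptions the random benchmark is a straight line from $1/n$ to 1, and its area approaches $1/2$ for large $n$. A forecast with genuine sorting capacity should lie above this line, most visibly in the high-cost region where $i$ is small.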

About this article

Cite this article

Bayerstadler, A., Benstetter, F., Heumann, C. et al. A predictive modeling approach to increasing the economic effectiveness of disease management programs. Health Care Manag Sci 17, 284–301 (2014). https://doi.org/10.1007/s10729-013-9246-y

Received: 08 October 2012
Accepted: 29 May 2013
Published: 19 June 2013
Issue Date: September 2014
DOI: https://doi.org/10.1007/s10729-013-9246-y
