# Present bias and health

## Abstract

This study uses a dynamic discrete choice model to examine the degree of present bias and naivete about present bias in individuals’ health care decisions. Clinical guidelines exist for several common chronic diseases. Although the empirical evidence for some guidelines is strong, many individuals with these diseases do not follow the guidelines. Using persons with diabetes as a case study, we find evidence of substantial present bias and naivete. Counterfactual simulations indicate the importance of present bias and naivete in explaining low adherence rates to health care guidelines.

1.

Fang and Wang (2015) develop methodologies for both finite and infinite time horizons. Here, like the empirical analysis in that paper, we assume a finite horizon with a maximum age of 100.

2.

For a formal discussion and technical details of the identification and estimation method, see Fang and Wang (2015).

3.

We know from the standard theories of discrete choice that we have to normalize the utility for the reference alternative to 0, so without loss of generality we set $$u_{0}^{\ast } \left (x,\mathbf {\varepsilon } \right ) = 0$$ for all $$x\in \mathcal {X}$$.

4.

Conditional Independence Assumption:

$$\begin{array}{@{}rcl@{}} \pi (x_{t + 1},\mathbf{\varepsilon}_{t + 1}|x_{t},\varepsilon_{t},d_{t}) &=&q(\mathbf{\varepsilon}_{t + 1}|x_{t + 1})\pi (x_{t + 1}|x_{t},d_{t}) \\ q(\mathbf{\varepsilon}_{t + 1}|x_{t + 1}) &=&q(\mathbf{\varepsilon} ). \end{array}$$

Extreme Value Distribution Assumption: $$\mathbf{\varepsilon}_{t}$$ is i.i.d. Type I extreme value distributed.

5.

Formally, the Exclusion Restriction assumption says that there exist state variables $$x_{1}\in \mathcal {X}$$ and $$x_{2}\in \mathcal {X}$$ with $$x_{1}\neq x_{2}$$, such that (1) for all $$i\in \mathcal {I}$$, $$u_{i}(x_{1}) = u_{i}(x_{2})$$; and (2) for some $$i\in \mathcal {I}$$, $$\pi (x|x_{1},i)\neq \pi (x|x_{2},i)$$.

6.

To check the robustness of our findings, we alternatively set Adherence to one if at least two, or at least four, of the five questions were answered affirmatively. The results are qualitatively similar.

7.

The total cognition summary score is generated in the RAND version of HRS. The total word recall summary variables sum the immediate and delayed word recall scores. The mental status summary sums the scores for serial 7’s, backwards counting from 20, and object, date, and President/Vice-President naming tasks. The total cognition score sums the total word recall and mental status summary scores, resulting in a range of 0-35. This score has been used in the literature as a good measure of cognitive ability.

8.

In robustness checks not shown, we set LowCog to one when the score is below 10 or 15, and the results are qualitatively similar.

9.

In the structural but not in the reduced form analysis presented below, we discretize LogIncome to estimate state transitions and utility and time preference; we set the person’s instantaneous utility when dead to zero.

10.
11.

There might be some unrelated future costs. For example, increased life expectancy resulting from increased adherence could lead to an increase in the incidence of other diseases (e.g., Alzheimer’s disease). An increase in adherence rates therefore might not be cost-saving when considered from a societal perspective.

## References

1. Abdellaoui, M., Attema, A.E., Bleichrodt, H. (2010). Intertemporal tradeoffs for gains and losses: An experimental measurement of discounted utility. The Economic Journal, 120(545), 845–866.

2. Akin, Z. (2012). Intertemporal decision making with present biased preferences. Journal of Economic Psychology, 33(1), 30–47.

3. Ali, M.K., Bullard, K.M., Saaddine, J.B., Cowie, C.C., Imperatore, G., Gregg, E.W. (2013). Achievement of goals in US diabetes care, 1999–2010. New England Journal of Medicine, 368(17), 1613–1624.

4. American Diabetes Association. (2013). Economic costs of diabetes in the US in 2012. Diabetes Care, 36(4), 1033–1046.

5. Andreoni, J., Kuhn, M.A., Sprenger, C. (2015). Measuring time preferences: A comparison of experimental methods. Journal of Economic Behavior & Organization, 116, 451–464.

6. Andreoni, J., & Sprenger, C. (2012). Estimating time preferences from convex budgets. American Economic Review, 102(7), 3333–56.

7. Arcidiacono, P., Sieg, H., Sloan, F. (2007). Living rationally under the volcano? An empirical analysis of heavy drinking and smoking. International Economic Review, 48(1), 37–65.

8. Ariely, D., & Wertenbroch, K.X. (2002). Procrastination, deadlines, and performance: Self-control by precommitment. Psychological Science, 13(3), 219–224.

9. Attema, A.E., Bleichrodt, H., L’Haridon, O., Peretti-Watel, P., Seror, V. (2018). Discounting health and money: New evidence using a more robust method. Journal of Risk and Uncertainty, 56(2), 117–140.

10. Augenblick, N., Niederle, M., Sprenger, C. (2015). Working over time: Dynamic inconsistency in real effort tasks. The Quarterly Journal of Economics, 130(3), 1067–1115.

11. Baicker, K., Mullainathan, S., Schwartzstein, J. (2015). Behavioral hazard in health insurance. The Quarterly Journal of Economics, 130(4), 1623–1667.

12. Bickel, W.K., Odum, A.L., Madden, G.J. (1999). Impulsivity and cigarette smoking: Delay discounting in current, never, and ex-smokers. Psychopharmacology, 146(4), 447–454.

13. Bleichrodt, H., Gao, Y., Rohde, K.I. (2016). A measurement of decreasing impatience for health and money. Journal of Risk and Uncertainty, 52(3), 213–231.

14. Boyle, J.P., Honeycutt, A.A., Narayan, K.V., Hoerger, T.J., Geiss, L.S., Chen, H., Thompson, T.J. (2001). Projection of diabetes burden through 2050: Impact of changing demography and disease prevalence in the US. Diabetes Care, 24(11), 1936–1940.

15. Bradford, D., Courtemanche, C., Heutel, G., McAlvanah, P., Ruhm, C. (2017). Time preferences and consumer behavior. Journal of Risk and Uncertainty, 55(2-3), 119–145.

16. Cavagnaro, D.R., Aranovich, G.J., McClure, S.M., Pitt, M.A., Myung, J.I. (2016). On the functional form of temporal discounting: an optimized adaptive test. Journal of Risk and Uncertainty, 52(3), 233–254.

17. Chapman, G.B. (1996). Temporal discounting and utility for health and money. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22(3), 771.

18. Chen, Y., Sloan, F.A., Yashkin, A.P. (2015). Adherence to diabetes guidelines for screening, physical activity and medication and onset of complications and death. Journal of Diabetes and its Complications, 29(8), 1228–1233.

19. Chung, D.J., Steenburgh, T., Sudhir, K. (2013). Do bonuses enhance sales productivity? A dynamic structural analysis of bonus-based compensation plans. Marketing Science, 33(2), 165–187.

20. Courtemanche, C., Heutel, G., McAlvanah, P. (2015). Impatience, incentives and obesity. The Economic Journal, 125(582), 1–31.

21. Cowie, C.C., Rust, K.F., Byrd-Holt, D.D., Eberhardt, M.S., Flegal, K.M., Engelgau, M.M., Saydah, S.H., Williams, D.E., Geiss, L.S., Gregg, E.W. (2006). Prevalence of diabetes and impaired fasting glucose in adults in the US population: National Health And Nutrition Examination Survey 1999–2002. Diabetes Care, 29(6), 1263–1268.

22. DellaVigna, S., & Malmendier, U. (2006). Paying not to go to the gym. American Economic Review, 96(3), 694–719.

23. DiMatteo, M.R. (2004). Variations in patients’ adherence to medical recommendations: A quantitative review of 50 years of research. Medical Care, 42(3), 200–209.

24. Fang, H., & Wang, Y. (2015). Estimating dynamic discrete choice models with hyperbolic discounting, with an application to mammography decisions. International Economic Review, 56(2), 565–596.

25. Ferecatu, A., & Önçüler, A. (2016). Heterogeneous risk and time preferences. Journal of Risk and Uncertainty, 53(1), 1–28.

26. Gruber, J., & Köszegi, B. (2001). Is addiction “rational”? Theory and evidence. The Quarterly Journal of Economics, 116(4), 1261–1303.

27. Hinvest, N.S., & Anderson, I.M. (2010). The effects of real versus hypothetical reward on delay and probability discounting. Quarterly Journal of Experimental Psychology, 63(6), 1072–1084.

28. Ho, P.M., Rumsfeld, J.S., Masoudi, F.A., McClure, D.L., Plomondon, M.E., Steiner, J.F., Magid, D.J. (2006). Effect of medication nonadherence on hospitalization and mortality among patients with diabetes mellitus. Archives of Internal Medicine, 166(17), 1836–1841.

29. Ikeda, S., Kang, M.I., Ohtake, F. (2010). Hyperbolic discounting, the sign effect, and the body mass index. Journal of Health Economics, 29(2), 268–284.

30. Ioannou, C.A., & Sadeh, J. (2016). Time preferences and risk aversion: Tests on domain differences. Journal of Risk and Uncertainty, 53(1), 29–54.

31. Kan, K. (2007). Cigarette smoking and self-control. Journal of Health Economics, 26(1), 61–81.

32. Madrian, B.C., & Shea, D.F. (2001). The power of suggestion: Inertia in 401(k) participation and savings behavior. The Quarterly Journal of Economics, 116(4), 1149–1187.

33. Magnac, T., & Thesmar, D. (2002). Identifying dynamic discrete decision processes. Econometrica, 70(2), 801–816.

34. Mokdad, A.H., Marks, J.S., Stroup, D.F., Gerberding, J.L. (2004). Actual causes of death in the United States, 2000. Journal of the American Medical Association, 291(10), 1238–1245.

35. Norris, S.L., Engelgau, M.M., Narayan, K.V. (2001). Effectiveness of self-management training in type 2 diabetes: A systematic review of randomized controlled trials. Diabetes Care, 24(3), 561–587.

36. O’Donoghue, T., & Rabin, M. (1999). Doing it now or later. American Economic Review, 89(1), 103–124.

37. Rust, J. (1994). Structural estimation of Markov decision processes. Handbook of Econometrics, 4, 3081–3143.

38. Sloan, F.A., Bethel, M.A., Ruiz, D., Shea, A.H., Feinglos, M.N. (2008). The growing burden of diabetes mellitus in the US elderly population. Archives of Internal Medicine, 168(2), 192–199.

39. Sloan, F.A., Padrón, N.A., Platt, A.C. (2009). Preferences, beliefs, and self-management of diabetes. Health Services Research, 44(3), 1068–1087.

40. Sloan, F.A., Eldred, L.M., Xu, Y. (2014). The behavioral economics of drunk driving. Journal of Health Economics, 35, 64–81.

41. US Department of Health and Human Services. (1994). Preventing tobacco use among young people: A report of the Surgeon General. US Department of Health and Human Services.

42. Van der Pol, M., & Cairns, J. (2011). Descriptive validity of alternative intertemporal models for health outcomes: An axiomatic test. Health Economics, 20(7), 770–782.

43. Yashkin, A.P., Hahn, P., Sloan, F.A. (2016). Introducing anti-vascular endothelial growth factor therapies for AMD did not raise risk of myocardial infarction, stroke, and death. Ophthalmology, 123(10), 2225–2231.

44. Zhuo, X., Zhang, P., Hoerger, T.J. (2013). Lifetime direct medical costs of treating type 2 diabetes and diabetic complications. American Journal of Preventive Medicine, 45(3), 253–261.

## Author information

### Corresponding author

Correspondence to Yang Wang.

Partial support for this research came from a grant from the National Institute on Aging to Duke University (NIA grant R01-AG017473).

## Appendix

This appendix follows Section 2 and provides more details on the identification and estimation of the dynamic discrete choice model. Specifically, given the Extreme Value Distribution assumption, the probability that action $$i$$ is chosen in state $$x_{t}$$, $$P_{i, t}(x_{t})$$, is:

$$P_{i, t}(x_{t}) =\Pr \left[W_{i, t}\left( x_{t}\right) +\varepsilon_{i, t}\geq W_{j, t}\left( x_{t}\right) +\varepsilon_{j, t}, \forall j\neq i \right] =\frac{ \exp \left[W_{i, t}\left( x_{t}\right) \right]} {{\sum}_{j = 0}^{1}\exp \left[ W_{j, t}\left( x_{t}\right) \right]}.$$
(9)

While $$W$$, defined in (4), is not observable, actual choice probabilities $$P_{i, t}(x_{t})$$ are observable in the data and can be used to infer $$W$$.
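As a minimal sketch of how (9) links values to observed probabilities, the snippet below computes binary logit choice probabilities and then inverts them. With two alternatives, (9) implies $$W_{1,t}(x_{t}) - W_{0,t}(x_{t}) = \ln P_{1,t}(x_{t}) - \ln P_{0,t}(x_{t})$$, so only the difference in values is recovered from the data. The function names and the numbers in `W` are hypothetical and are not taken from the paper.

```python
import math


def choice_probs(W):
    """Logit choice probabilities as in (9), computed stably."""
    m = max(W)
    weights = [math.exp(w - m) for w in W]
    total = sum(weights)
    return [w / total for w in weights]


def value_difference(p1):
    """Recover W_1 - W_0 from the probability of choosing action 1."""
    return math.log(p1) - math.log(1.0 - p1)


# Hypothetical choice-specific values for actions 0 and 1 in one state.
W = [0.0, -0.4]
p = choice_probs(W)
recovered = value_difference(p[1])
print(p, recovered)
```

Only the difference $$W_{1,t} - W_{0,t}$$ is pinned down this way, which is one reason a reference alternative has to be normalized.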

With the choice-specific value function of the next-period self as perceived by the current self, $$Z_{i, t + 1}(x_{t + 1})$$, defined in (5), the current self’s perception of her future self’s choice, $$\sigma$$, can be defined as

$$\begin{array}{@{}rcl@{}} \sigma\left( x_{t + 1},\mathbf{\varepsilon}_{t + 1}\right) &=&\arg\max_{i\in \mathcal{I}}\left[ u_{i, t + 1}\left( x_{t + 1}\right) +\varepsilon_{i, t + 1}+\tilde{\beta}\delta \sum\limits_{x_{t + 2}\in \mathcal{X}}V_{t + 2}(x_{t + 2})\pi (x_{t + 2}|x_{t + 1}, i)\right] \\ &=&\arg\max_{i\in \mathcal{I}}\left[ Z_{i, t + 1}\left( x_{t + 1}\right) +\varepsilon_{i, t + 1}\right] . \end{array}$$

Then, again under the Extreme Value Distribution assumption, the probability perceived by the current-period self that the next-period self chooses alternative $$i$$ when the next period’s state is $$x_{t + 1}$$, $$\tilde {P}_{i, t + 1}\left (x_{t + 1}\right )$$, is:

$$\begin{array}{@{}rcl@{}} \tilde{P}_{i, t + 1}\left( x_{t + 1}\right) &=&\Pr \left[ \sigma\left( x_{t + 1},\mathbf{\varepsilon}_{t + 1}\right) =i\right] \\ &=&\Pr \left[Z_{i, t + 1}\left( x_{t + 1}\right) +\varepsilon_{i, t + 1}\geq Z_{j, t + 1}\left( x_{t + 1}\right) +\varepsilon_{j, t + 1},\forall j\neq i\right] \\ &=&\frac{\exp \left[ Z_{i, t + 1}\left( x_{t + 1}\right) \right]} {{\sum}_{j = 0}^{1}\exp \left[ Z_{j, t + 1}\left( x_{t + 1}\right) \right]} . \end{array}$$
(10)

The distinction between $$\tilde {P}$$ and $$P$$ is that the sophisticated present-biased decision-maker knows the extent of her actual future present bias; by contrast, the naive person underestimates the extent of her present bias. That is, she believes her future present-bias parameter to be $$\tilde{\beta}$$, which is larger than her actual $$\beta$$. For sophisticated persons, $$\tilde {P}= P$$; for naive ones, $$\tilde {P} \neq P$$.

With non-stationarity and a finite horizon, at t = T, when the continuation value is zero, there is no distinction among W, Z, V, and u:

$$W_{i, T}=Z_{i, T}=V_{i, T}=u_{i, T},$$

which, according to the Extreme Value Distribution assumption, leads to:

$$V_{T}=\ln {\sum\limits_{i\in \mathcal{I}}\exp [Z_{i, T}]}=\ln {\sum\limits_{i\in \mathcal{ I}}\exp [u_{i, T}]}.$$
(11)

Combining (11) and (5) yields $$Z_{i, T-1}$$. Given the link between $$Z_{i, T-1}$$ and $$V_{T-1}$$, backward induction determines $$V_{t + 1}$$, which in turn relates to $$W_{i, t}$$ through (4) and then to $$P_{i, t}(x_{t})$$ through (9). By this reasoning, we link instantaneous utility $$u$$ to $$P$$, which is observable in the data. Once this relationship between $$u$$ and $$P$$ is established empirically from the choice probabilities $$P_{i, t}(x_{t})$$ and transition probabilities $$\pi (x_{t + 1}|x_{t},i)$$ for all $$x\in \mathcal {X}$$ and for $$i\in \{0, 1\}$$, we can estimate the utility parameters for a given $$\left \langle \beta ,\tilde {\beta },\delta \right \rangle$$.

The relationship between $$Z_{i, t}$$ and $$V_{t}$$ can be described in three steps. First, combining (5) and (6) yields

$$V_{i, t + 1}\left( x_{t + 1}\right) =Z_{i, t + 1}\left( x_{t + 1}\right) +\left( 1- \tilde{\beta}\right) \delta {\sum\limits_{x_{t + 2}\in \mathcal{X}}{V_{t + 2}(}}x_{t + 2}{ )\pi (}x_{t + 2}{{|x_{t + 1}, i)}.}$$
(12)

Given (12), (7) can be rewritten as:

$$\begin{array}{@{}rcl@{}} V_{t + 1}\left( x_{t + 1}\right) &=&\mathrm{E}_{\varepsilon_{t + 1}}\left[ V_{\sigma\left( x_{t + 1},\mathbf{\varepsilon}_{t + 1}\right), t + 1} \left( x_{t + 1}\right) +\varepsilon_{\sigma\left( x_{t + 1}, \mathbf{\varepsilon}_{t + 1}\right), t + 1}\right] \\ &=&\mathrm{E}_{\mathbf{\varepsilon}_{t + 1}}\left[ \begin{array}{c} Z_{\sigma\left( x_{t + 1},\mathbf{\varepsilon}_{t + 1}\right), t + 1} \left( x_{t + 1}\right) +\varepsilon_{\sigma\left( x_{t + 1}, \mathbf{\varepsilon}_{t + 1}\right), t + 1} \\ +\left( 1-\tilde{\beta}\right) \delta {{\sum}_{x_{t + 2}\in \mathcal{X}}{V_{t + 2}(} }x_{t + 2}{)\pi (}x_{t + 2}{{|x_{t + 1}, \sigma\left( x_{t + 1},\mathbf{ \varepsilon}_{t + 1}\right) )}} \end{array} \right] \\ &=&\mathrm{E}_{\mathbf{\varepsilon}_{t + 1}}{\max_{i\in \mathcal{I}}}\left[ Z_{i, t + 1}\left( x_{t + 1}\right) +{\varepsilon}_{i,t + 1}\right] \\ &&+\left( 1-\tilde{\beta}\right) \delta \mathrm{E}_{\mathbf{\varepsilon} _{t + 1}}{\sum\limits_{x_{t + 2}\in \mathcal{X}}{V_{t + 2}(}}x_{t + 2}{)\pi (}x_{t + 2}{{ |x_{t + 1}, \sigma\left( x_{t + 1},\mathbf{\varepsilon}_{t + 1}\right) )}} \\ &=&\mathrm{E}_{\mathbf{\varepsilon}_{t + 1}}{\max_{i\in \mathcal{I}}}\left[ Z_{i, t + 1}\left( x_{t + 1}\right) +{\varepsilon}_{i,t + 1}\right] \\ &+&\left( 1-\tilde{\beta}\right) \delta \sum\limits_{i\in \mathcal{I}}\tilde{P} _{i, t + 1}\left( x_{t + 1}\right) {\sum\limits_{x_{t + 2}\in \mathcal{X}}{V_{t + 2}(}} x_{t + 2}{)\pi (}x_{t + 2}{{|x_{t + 1}, i)}}. \end{array}$$
(13)

Given the Extreme Value Distribution assumption,

$$\mathrm{E}_{\mathbf{\varepsilon}_{t + 1}}{\max_{i\in \mathcal{I}}} \{Z_{i, t + 1}(x_{t + 1})+\varepsilon_{i,t + 1}\}=\ln \left\{ \sum\limits_{i\in \mathcal{I}}\exp \left[ Z_{i, t + 1}\left( x_{t + 1}\right) \right] \right\}.$$
(14)
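The closed form in (14) can be checked numerically. The sketch below assumes the shocks are Type I extreme value draws shifted to have mean zero; with unshifted standard draws the expectation would exceed the log-sum by Euler's constant (about 0.5772), a shift that is the same for every alternative and so does not affect choices. Which normalization the paper uses is not stated in this appendix, so the mean-zero shift is an assumption, as are the values in `Z`, the seed, and the number of draws.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # mean of a standard Type I extreme value draw


def draw_mean_zero_ev(rng):
    """One Type I extreme value draw, shifted to have mean zero (assumed normalization)."""
    u = rng.random()
    while u == 0.0:  # guard against log(0)
        u = rng.random()
    return -math.log(-math.log(u)) - EULER_GAMMA


def simulated_emax(Z, n_draws=200_000, seed=0):
    """Monte Carlo estimate of E[max_i (Z_i + eps_i)]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        total += max(z + draw_mean_zero_ev(rng) for z in Z)
    return total / n_draws


# Hypothetical perceived values for the two alternatives in one state.
Z = [0.3, -0.2]
closed_form = math.log(sum(math.exp(z) for z in Z))  # right-hand side of (14)
simulated = simulated_emax(Z)
print(closed_form, simulated)
```

The two numbers should agree up to simulation error, which shrinks as `n_draws` grows.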

Combining (13) with (14) and (10) yields

$$\begin{array}{@{}rcl@{}} V_{t + 1}\left( x_{t + 1}\right) &=&\ln \left\{ \sum\limits_{i\in \mathcal{I}}\exp \left[ Z_{i, t + 1}\left( x_{t + 1}\right) \right] \right\} \\ &&+\left( 1-\tilde{\beta}\right) \delta \sum\limits_{i\in \mathcal{I}}\frac{\exp \left[Z_{i, t + 1}\left( x_{t + 1}\right) \right]} {{\sum}_{j = 0}^{1}\exp \left[ Z_{j, t + 1}\left( x_{t + 1}\right) \right]} \sum\limits_{x_{t + 2}\in \mathcal{X}}V_{t + 2}(x_{t + 2})\pi (x_{t + 2}|x_{t + 1}, i), \end{array}$$
(15)

which relates $$Z_{i, t}$$ to $$V_{t}$$, a relationship that makes backward induction possible.
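The backward induction described in this appendix can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' estimation code: equation (4) is not reproduced here, so the form $$W_{i,t} = u_{i,t} + \beta \delta \sum V_{t+1}\pi$$ is assumed by analogy with $$Z$$; the utilities `u`, transitions `pi`, horizon, and parameter values are hypothetical; utilities are time-invariant and, for readability, do not impose the normalization of the reference alternative to zero.

```python
import math


def logit(values):
    """Logit choice probabilities, as in (9) and (10)."""
    m = max(values)
    weights = [math.exp(v - m) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


def log_sum_exp(values):
    """Closed-form expected maximum, as in (11) and (14)."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def solve(u, pi, T, beta, beta_tilde, delta):
    """Backward induction; returns (P, P_tilde), each indexed [t][x][i].

    u[i][x]: hypothetical flow utility of action i in state x.
    pi[i][x][y]: probability of moving from state x to y under action i.
    """
    n_act, n_state = len(u), len(u[0])
    P = [None] * (T + 1)
    P_tilde = [None] * (T + 1)
    # Terminal period: W = Z = u, so V_T is the log-sum in (11).
    V_next = [log_sum_exp([u[i][x] for i in range(n_act)]) for x in range(n_state)]
    P[T] = [logit([u[i][x] for i in range(n_act)]) for x in range(n_state)]
    P_tilde[T] = [row[:] for row in P[T]]
    for t in range(T - 1, -1, -1):
        V_now, P[t], P_tilde[t] = [], [], []
        for x in range(n_state):
            # Expected next-period value for each action.
            EV = [sum(pi[i][x][y] * V_next[y] for y in range(n_state))
                  for i in range(n_act)]
            W = [u[i][x] + beta * delta * EV[i] for i in range(n_act)]  # assumed (4)
            Z = [u[i][x] + beta_tilde * delta * EV[i] for i in range(n_act)]  # (5)
            p_tilde = logit(Z)  # perceived probabilities, (10)
            P[t].append(logit(W))  # actual probabilities, (9)
            P_tilde[t].append(p_tilde)
            # Long-run value, (15) written for period t.
            V_now.append(log_sum_exp(Z) + (1.0 - beta_tilde) * delta
                         * sum(p_tilde[i] * EV[i] for i in range(n_act)))
        V_next = V_now
    return P, P_tilde


# Hypothetical example: state 0 = good health, 1 = poor health;
# action 0 = do not adhere, action 1 = adhere (costly today, better transitions).
u = [[1.0, 0.0], [0.5, -0.5]]
pi = [[[0.7, 0.3], [0.1, 0.9]], [[0.9, 0.1], [0.3, 0.7]]]
P_naive, P_tilde_naive = solve(u, pi, T=5, beta=0.5, beta_tilde=0.9, delta=0.95)
print(P_naive[0][0][1], P_tilde_naive[0][0][1])
```

Setting `beta_tilde` equal to `beta` makes the two returned arrays coincide, which corresponds to the sophisticated case; a gap between them corresponds to naivete.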
