Present bias and health


This study uses a dynamic discrete choice model to examine the degree of present bias and naivete about present bias in individuals’ health care decisions. Clinical guidelines exist for several common chronic diseases. Although the empirical evidence for some guidelines is strong, many individuals with these diseases do not follow the guidelines. Using persons with diabetes as a case study, we find evidence of substantial present bias and naivete. Counterfactual simulations indicate the importance of present bias and naivete in explaining low adherence rates to health care guidelines.

Notes


  1.

    Fang and Wang (2015) develop methodologies for both finite and infinite time horizons. Here, as in the empirical analysis in that paper, we assume a finite horizon with a maximum age of 100.

  2.

    For a formal discussion and technical details of the identification and estimation method, see Fang and Wang (2015).

  3.

    We know from the standard theories of discrete choice that we have to normalize the utility for the reference alternative to 0, so without loss of generality we set \(u_{0}^{\ast } \left (x,\mathbf {\varepsilon } \right ) = 0\) for all \(x\in \mathcal {X}\).

  4.

    Conditional Independence Assumption:

    $$\begin{array}{@{}rcl@{}} \pi (x_{t + 1},\mathbf{\varepsilon}_{t + 1}|x_{t},\varepsilon_{t},d_{t}) &=&q(\mathbf{\varepsilon}_{t + 1}|x_{t + 1})\pi (x_{t + 1}|x_{t},d_{t}) \\ q(\mathbf{\varepsilon}_{t + 1}|x_{t + 1}) &=&q(\mathbf{\varepsilon} ). \end{array} $$

    Extreme Value Distribution Assumption: \(\varepsilon_{t}\) is i.i.d. Type I extreme value distributed.

  5.

    Formally, the Exclusion Restriction assumption says that there exist state variables \(x_{1}\in \mathcal {X}\) and \(x_{2}\in \mathcal {X}\) with \(x_{1}\neq x_{2}\), such that (1) for all \(i\in \mathcal {I}\), \(u_{i}(x_{1}) = u_{i}(x_{2})\); and (2) for some \(i\in \mathcal {I}\), \(\pi (x|x_{1},i)\neq \pi (x|x_{2},i)\).

  6.

    To check the robustness of our findings, we alternatively set Adherence to one if at least two, or at least four, of the five questions were answered affirmatively. The results are qualitatively similar.

  7.

    The total cognition summary score is generated in the RAND version of HRS. The total word recall summary variables sum the immediate and delayed word recall scores. The mental status summary sums the scores for serial 7’s, backwards counting from 20, and object, date, and President/Vice-President naming tasks. The total cognition score sums the total word recall and mental status summary scores, resulting in a range of 0-35. This score has been used in the literature as a good measure of cognitive ability.

  8.

    In robustness checks not shown, we set LowCog to one when the score is below 10 or 15, and the results are qualitatively similar.

  9.

    In the structural but not in the reduced form analysis presented below, we discretize LogIncome to estimate state transitions and utility and time preference; we set the person’s instantaneous utility when dead to zero.

  10.

    Accessed 10/01/17.

  11.

    There might be some unrelated future costs. For example, increased life expectancy resulting from increased adherence could lead to an increase in the number of other diseases (e.g. Alzheimer’s disease). The increase in adherence rates therefore might not necessarily be cost-saving when considered from a societal perspective.

References


  1. Abdellaoui, M., Attema, A.E., Bleichrodt, H. (2010). Intertemporal tradeoffs for gains and losses: An experimental measurement of discounted utility. The Economic Journal, 120(545), 845–866.

  2. Akin, Z. (2012). Intertemporal decision making with present biased preferences. Journal of Economic Psychology, 33(1), 30–47.

  3. Ali, M.K., Bullard, K.M., Saaddine, J.B., Cowie, C.C., Imperatore, G., Gregg, E.W. (2013). Achievement of goals in US diabetes care, 1999–2010. New England Journal of Medicine, 368(17), 1613–1624.

  4. American Diabetes Association. (2013). Economic costs of diabetes in the US in 2012. Diabetes Care, 36(4), 1033–1046.

  5. Andreoni, J., Kuhn, M.A., Sprenger, C. (2015). Measuring time preferences: A comparison of experimental methods. Journal of Economic Behavior & Organization, 116, 451–464.

  6. Andreoni, J., & Sprenger, C. (2012). Estimating time preferences from convex budgets. American Economic Review, 102(7), 3333–3356.

  7. Arcidiacono, P., Sieg, H., Sloan, F. (2007). Living rationally under the volcano? An empirical analysis of heavy drinking and smoking. International Economic Review, 48(1), 37–65.

  8. Ariely, D., & Wertenbroch, K.X. (2002). Procrastination, deadlines, and performance: Self-control by precommitment. Psychological Science, 13(3), 219–224.

  9. Attema, A.E., Bleichrodt, H., L’Haridon, O., Peretti-Watel, P., Seror, V. (2018). Discounting health and money: New evidence using a more robust method. Journal of Risk and Uncertainty, 56(2), 117–140.

  10. Augenblick, N., Niederle, M., Sprenger, C. (2015). Working over time: Dynamic inconsistency in real effort tasks. The Quarterly Journal of Economics, 130(3), 1067–1115.

  11. Baicker, K., Mullainathan, S., Schwartzstein, J. (2015). Behavioral hazard in health insurance. The Quarterly Journal of Economics, 130(4), 1623–1667.

  12. Bickel, W.K., Odum, A.L., Madden, G.J. (1999). Impulsivity and cigarette smoking: Delay discounting in current, never, and ex-smokers. Psychopharmacology, 146(4), 447–454.

  13. Bleichrodt, H., Gao, Y., Rohde, K.I. (2016). A measurement of decreasing impatience for health and money. Journal of Risk and Uncertainty, 52(3), 213–231.

  14. Boyle, J.P., Honeycutt, A.A., Narayan, K.V., Hoerger, T.J., Geiss, L.S., Chen, H., Thompson, T.J. (2001). Projection of diabetes burden through 2050: Impact of changing demography and disease prevalence in the US. Diabetes Care, 24(11), 1936–1940.

  15. Bradford, D., Courtemanche, C., Heutel, G., McAlvanah, P., Ruhm, C. (2017). Time preferences and consumer behavior. Journal of Risk and Uncertainty, 55(2-3), 119–145.

  16. Cavagnaro, D.R., Aranovich, G.J., McClure, S.M., Pitt, M.A., Myung, J.I. (2016). On the functional form of temporal discounting: an optimized adaptive test. Journal of Risk and Uncertainty, 52(3), 233–254.

  17. Chapman, G.B. (1996). Temporal discounting and utility for health and money. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22(3), 771.

  18. Chen, Y., Sloan, F.A., Yashkin, A.P. (2015). Adherence to diabetes guidelines for screening, physical activity and medication and onset of complications and death. Journal of Diabetes and its Complications, 29(8), 1228–1233.

  19. Chung, D.J., Steenburgh, T., Sudhir, K. (2013). Do bonuses enhance sales productivity? A dynamic structural analysis of bonus-based compensation plans. Marketing Science, 33(2), 165–187.

  20. Courtemanche, C., Heutel, G., McAlvanah, P. (2015). Impatience, incentives and obesity. The Economic Journal, 125(582), 1–31.

  21. Cowie, C.C., Rust, K.F., Byrd-Holt, D.D., Eberhardt, M.S., Flegal, K.M., Engelgau, M.M., Saydah, S.H., Williams, D.E., Geiss, L.S., Gregg, E.W. (2006). Prevalence of diabetes and impaired fasting glucose in adults in the US population: National Health And Nutrition Examination Survey 1999–2002. Diabetes Care, 29(6), 1263–1268.

  22. DellaVigna, S., & Malmendier, U. (2006). Paying not to go to the gym. American Economic Review, 96(3), 694–719.

  23. DiMatteo, M.R. (2004). Variations in patients’ adherence to medical recommendations: A quantitative review of 50 years of research. Medical Care, 42(3), 200–209.

  24. Fang, H., & Wang, Y. (2015). Estimating dynamic discrete choice models with hyperbolic discounting, with an application to mammography decisions. International Economic Review, 56(2), 565–596.

  25. Ferecatu, A., & Önçüler, A. (2016). Heterogeneous risk and time preferences. Journal of Risk and Uncertainty, 53(1), 1–28.

  26. Gruber, J., & Köszegi, B. (2001). Is addiction “rational”? Theory and evidence. The Quarterly Journal of Economics, 116(4), 1261–1303.

  27. Hinvest, N.S., & Anderson, I.M. (2010). The effects of real versus hypothetical reward on delay and probability discounting. Quarterly Journal of Experimental Psychology, 63(6), 1072–1084.

  28. Ho, P.M., Rumsfeld, J.S., Masoudi, F.A., McClure, D.L., Plomondon, M.E., Steiner, J.F., Magid, D.J. (2006). Effect of medication nonadherence on hospitalization and mortality among patients with diabetes mellitus. Archives of Internal Medicine, 166(17), 1836–1841.

  29. Ikeda, S., Kang, M.I., Ohtake, F. (2010). Hyperbolic discounting, the sign effect, and the body mass index. Journal of Health Economics, 29(2), 268–284.

  30. Ioannou, C.A., & Sadeh, J. (2016). Time preferences and risk aversion: Tests on domain differences. Journal of Risk and Uncertainty, 53(1), 29–54.

  31. Kan, K. (2007). Cigarette smoking and self-control. Journal of Health Economics, 26(1), 61–81.

  32. Madrian, B.C., & Shea, D.F. (2001). The power of suggestion: Inertia in 401(k) participation and savings behavior. The Quarterly Journal of Economics, 116(4), 1149–1187.

  33. Magnac, T., & Thesmar, D. (2002). Identifying dynamic discrete decision processes. Econometrica, 70(2), 801–816.

  34. Mokdad, A.H., Marks, J.S., Stroup, D.F., Gerberding, J.L. (2004). Actual causes of death in the United States, 2000. Journal of the American Medical Association, 291(10), 1238–1245.

  35. Norris, S.L., Engelgau, M.M., Narayan, K.V. (2001). Effectiveness of self-management training in type 2 diabetes: A systematic review of randomized controlled trials. Diabetes Care, 24(3), 561–587.

  36. O’Donoghue, T., & Rabin, M. (1999). Doing it now or later. American Economic Review, 89(1), 103–124.

  37. Rust, J. (1994). Structural estimation of Markov decision processes. Handbook of Econometrics, 4, 3081–3143.

  38. Sloan, F.A., Bethel, M.A., Ruiz, D., Shea, A.H., Feinglos, M.N. (2008). The growing burden of diabetes mellitus in the US elderly population. Archives of Internal Medicine, 168(2), 192–199.

  39. Sloan, F.A., Padrón, N.A., Platt, A.C. (2009). Preferences, beliefs, and self-management of diabetes. Health Services Research, 44(3), 1068–1087.

  40. Sloan, F.A., Eldred, L.M., Xu, Y. (2014). The behavioral economics of drunk driving. Journal of Health Economics, 35, 64–81.

  41. US Department of Health and Human Services. (1994). Preventing tobacco use among young people: A report of the Surgeon General. US Department of Health and Human Services.

  42. Van der Pol, M., & Cairns, J. (2011). Descriptive validity of alternative intertemporal models for health outcomes: An axiomatic test. Health Economics, 20(7), 770–782.

  43. Yashkin, A.P., Hahn, P., Sloan, F.A. (2016). Introducing anti-vascular endothelial growth factor therapies for AMD did not raise risk of myocardial infarction, stroke, and death. Ophthalmology, 123(10), 2225–2231.

  44. Zhuo, X., Zhang, P., Hoerger, T.J. (2013). Lifetime direct medical costs of treating type 2 diabetes and diabetic complications. American Journal of Preventive Medicine, 45(3), 253–261.

Author information



Corresponding author

Correspondence to Yang Wang.

Additional information

Partial support for this research came from a grant from the National Institute on Aging to Duke University (NIA grant R01-AG017473).

Appendix



This appendix follows Section 2 and provides more details on the identification and estimation of the dynamic discrete choice model. Specifically, given the Extreme Value Distribution assumption, the probability of action \(i\) being chosen given \(x_{t}\), \(P_{i,t}(x_{t})\), is:

$$ P_{i,t}(x_{t}) = \Pr \left[ W_{i,t}\left( x_{t}\right) +\varepsilon_{i,t}\geq W_{j,t}\left( x_{t}\right) +\varepsilon_{j,t},\ \forall j\neq i \right] = \frac{\exp \left[ W_{i,t}\left( x_{t}\right) \right]}{\sum_{j=0}^{1}\exp \left[ W_{j,t}\left( x_{t}\right) \right]}. \qquad (9) $$

While \(W\), defined in (4), is not observable, actual choice probabilities \(P_{i,t}(x_{t})\) are observable in the data and can be used to infer \(W\).
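The inversion from observed choice probabilities to \(W\) can be illustrated for the binary case. The sketch below is a minimal illustration, not the authors' estimation code; the function names and numerical values are hypothetical. It evaluates the logit formula above and recovers the difference \(W_{1,t}(x_{t}) - W_{0,t}(x_{t})\) as the log odds. Only this difference is pinned down by the choice probabilities, since adding a constant to both values leaves the probabilities unchanged.

```python
import math

def choice_probs(w):
    """Logit choice probabilities for a list of choice-specific values."""
    m = max(w)  # subtract the max for numerical stability
    expw = [math.exp(v - m) for v in w]
    total = sum(expw)
    return [e / total for e in expw]

def value_difference(p1):
    """Recover W_1 - W_0 from the probability of choosing action 1."""
    return math.log(p1 / (1.0 - p1))

# Hypothetical values of W_0 and W_1 at some state x_t.
w = [0.0, 0.8]
p = choice_probs(w)
recovered = value_difference(p[1])
print(p, recovered)
```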

With the choice-specific value function of the next-period self perceived by the current self, \(Z_{i,t+1}(x_{t+1})\), defined in (5), the current self’s perception of her future self’s choice, \(\sigma\), can be defined as

$$\begin{array}{@{}rcl@{}} \sigma\left( x_{t+1},\mathbf{\varepsilon}_{t+1}\right) &=& \arg\max_{i\in \mathcal{I}}\left[ u_{i,t+1}\left( x_{t+1}\right) +\varepsilon_{i,t+1}+\tilde{\beta}\delta \sum\limits_{x_{t+2}\in \mathcal{X}} V_{t+2}(x_{t+2})\,\pi (x_{t+2}|x_{t+1},i)\right] \\ &=& \arg\max_{i\in \mathcal{I}}\left[ Z_{i,t+1}\left( x_{t+1}\right) +\varepsilon_{i,t+1}\right] . \end{array} $$

Then, again under the Extreme Value Distribution assumption, the probability perceived by the current-period self that the next-period self chooses alternative \(i\) when the next period’s state is \(x_{t+1}\), \(\tilde {P}_{i,t+1}\left (x_{t+1}\right )\), is:

$$\begin{array}{@{}rcl@{}} \tilde{P}_{i,t+1}\left( x_{t+1}\right) &=& \Pr \left[ \sigma\left( x_{t+1},\mathbf{\varepsilon}_{t+1}\right) =i\right] \\ &=& \Pr \left[ Z_{i,t+1}\left( x_{t+1}\right) +\varepsilon_{i,t+1}\geq Z_{j,t+1}\left( x_{t+1}\right) +\varepsilon_{j,t+1},\ \forall j\neq i\right] \\ &=& \frac{\exp \left[ Z_{i,t+1}\left( x_{t+1}\right) \right]}{\sum_{j=0}^{1}\exp \left[ Z_{j,t+1}\left( x_{t+1}\right) \right]} . \end{array} \qquad (10) $$

The distinction between \(\tilde {P}\) and P is that the sophisticated present-biased decision-maker knows the extent of her actual future present bias; by contrast, the naive person underestimates the extent of her present bias. That is, she thinks her β is larger than it actually will be. For sophisticated persons, \(\tilde {P}= P\); for naive ones, \(\tilde {P} \neq P\).
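A small numerical sketch may make the distinction concrete. It assumes, following the form of \(Z\) shown above, that the actual next-period self's choice-specific value is \(u + \beta\delta\,\mathrm{E}V\) (the definition of \(W\) in (4), which is not restated in this appendix), while the perceived one is \(u + \tilde{\beta}\delta\,\mathrm{E}V\). All numbers and function names are hypothetical.

```python
import math

def logit(values):
    """Logit probabilities for a list of choice-specific values."""
    m = max(values)
    e = [math.exp(v - m) for v in values]
    s = sum(e)
    return [x / s for x in e]

def next_period_probs(u, ev, beta_used, delta):
    """Choice probabilities when continuation values are weighted by beta_used * delta."""
    return logit([u_i + beta_used * delta * ev_i for u_i, ev_i in zip(u, ev)])

# Hypothetical inputs: action 1 (adhere) is costly today but raises expected future value.
u = [0.0, -1.0]     # instantaneous utilities u_0, u_1
ev = [10.0, 12.0]   # expected continuation values by action (assumed)
delta = 0.95
beta, beta_tilde = 0.5, 0.8  # naive case: beta_tilde > beta

p_actual = next_period_probs(u, ev, beta, delta)           # P: uses the true beta
p_perceived = next_period_probs(u, ev, beta_tilde, delta)  # P-tilde: uses beta_tilde
p_sophisticated = next_period_probs(u, ev, beta, delta)    # P-tilde when beta_tilde == beta
print(p_actual[1], p_perceived[1])
```

With these inputs the naive person expects her future self to adhere more often than that self actually will, while the sophisticated person's perceived and actual probabilities coincide.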

With non-stationarity and a finite horizon, at t = T, when the continuation value is zero, there is no distinction among W, Z, V, and u:

$$W_{i, T}=Z_{i, T}=V_{i, T}=u_{i, T}, $$

which, according to the Extreme Value Distribution assumption, leads to:

$$ V_{T}=\ln \sum\limits_{i\in \mathcal{I}}\exp [Z_{i,T}]=\ln \sum\limits_{i\in \mathcal{I}}\exp [u_{i,T}]. \qquad (11) $$

Combining (11) and (5) yields \(Z_{i,T-1}\). Given the link between \(Z_{i,T-1}\) and \(V_{T-1}\), using backward induction, \(V_{t+1}\) can be determined, which in turn relates to \(W_{i,t}\) (4), and then to \(P_{i,t}(x_{t})\) (9). By this reasoning, we link instantaneous utility \(u\) to \(P\), which is observable in the data. Once this relationship between \(u\) and \(P\) is established empirically from the choice probabilities \(P_{i,t}(x_{t})\) and transition probabilities \(\pi (x_{t+1}|x_{t},i)\) for all \(x\in \mathcal {X}\) and for \(i\in \{0,1\}\), we can estimate the utility parameters for a given \(\left \langle \beta ,\tilde {\beta },\delta \right \rangle \).

The relationship between \(Z_{i,t}\) and \(V_{t}\) can be described in three steps. First, combining (5) and (6) yields

$$ V_{i,t+1}\left( x_{t+1}\right) =Z_{i,t+1}\left( x_{t+1}\right) +\left( 1-\tilde{\beta}\right) \delta \sum\limits_{x_{t+2}\in \mathcal{X}} V_{t+2}(x_{t+2})\,\pi (x_{t+2}|x_{t+1},i). \qquad (12) $$

Given (12), (7) can be rewritten as:

$$\begin{array}{@{}rcl@{}} V_{t+1}\left( x_{t+1}\right) &=& \mathrm{E}_{\mathbf{\varepsilon}_{t+1}}\left[ V_{\sigma\left( x_{t+1},\mathbf{\varepsilon}_{t+1}\right),t+1}\left( x_{t+1}\right) +\varepsilon_{\sigma\left( x_{t+1},\mathbf{\varepsilon}_{t+1}\right),t+1}\right] \\ &=& \mathrm{E}_{\mathbf{\varepsilon}_{t+1}}\left[ \begin{array}{c} Z_{\sigma\left( x_{t+1},\mathbf{\varepsilon}_{t+1}\right),t+1}\left( x_{t+1}\right) +\varepsilon_{\sigma\left( x_{t+1},\mathbf{\varepsilon}_{t+1}\right),t+1} \\ +\left( 1-\tilde{\beta}\right) \delta \sum_{x_{t+2}\in \mathcal{X}} V_{t+2}(x_{t+2})\,\pi \left( x_{t+2}|x_{t+1},\sigma\left( x_{t+1},\mathbf{\varepsilon}_{t+1}\right) \right) \end{array} \right] \\ &=& \mathrm{E}_{\mathbf{\varepsilon}_{t+1}}\max_{i\in \mathcal{I}}\left[ Z_{i,t+1}\left( x_{t+1}\right) +\varepsilon_{i,t+1}\right] \\ && +\left( 1-\tilde{\beta}\right) \delta\, \mathrm{E}_{\mathbf{\varepsilon}_{t+1}}\sum\limits_{x_{t+2}\in \mathcal{X}} V_{t+2}(x_{t+2})\,\pi \left( x_{t+2}|x_{t+1},\sigma\left( x_{t+1},\mathbf{\varepsilon}_{t+1}\right) \right) \\ &=& \mathrm{E}_{\mathbf{\varepsilon}_{t+1}}\max_{i\in \mathcal{I}}\left[ Z_{i,t+1}\left( x_{t+1}\right) +\varepsilon_{i,t+1}\right] \\ && +\left( 1-\tilde{\beta}\right) \delta \sum\limits_{i\in \mathcal{I}}\tilde{P}_{i,t+1}\left( x_{t+1}\right) \sum\limits_{x_{t+2}\in \mathcal{X}} V_{t+2}(x_{t+2})\,\pi (x_{t+2}|x_{t+1},i). \end{array} \qquad (13) $$

Given the Extreme Value Distribution assumption,

$$ \mathrm{E}_{\mathbf{\varepsilon}_{t+1}}\max_{i\in \mathcal{I}}\left\{ Z_{i,t+1}(x_{t+1})+\varepsilon_{i,t+1}\right\} =\ln \left\{ \sum\limits_{i\in \mathcal{I}}\exp \left[ Z_{i,t+1}\left( x_{t+1}\right) \right] \right\}. \qquad (14) $$
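The closed form above can be checked by simulation. The sketch below is an illustration under a stated assumption: the shocks are Type I extreme value draws shifted to have mean zero (location \(-\gamma\), where \(\gamma\) is Euler's constant). With the unshifted standard distribution, the expected maximum exceeds the log-sum by \(\gamma\). The \(Z\) values and function names are hypothetical.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329

def log_sum_exp(values):
    """Numerically stable ln(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def gumbel_draw(rng):
    """Inverse-CDF draw from the standard Type I extreme value distribution."""
    u = rng.random()
    while u == 0.0:  # random() can return exactly 0.0; avoid log(0)
        u = rng.random()
    return -math.log(-math.log(u))

def simulated_emax(z, n_draws=200_000, seed=0):
    """Monte Carlo estimate of E[max_i (z_i + eps_i)] with mean-zero shocks."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        total += max(z_i + gumbel_draw(rng) - EULER_GAMMA for z_i in z)
    return total / n_draws

z = [0.3, -0.2]  # hypothetical Z_0 and Z_1 at some state
closed_form = log_sum_exp(z)
simulated = simulated_emax(z)
print(closed_form, simulated)
```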

Combined with (14) and (10), (13) can be rewritten as

$$\begin{array}{@{}rcl@{}} V_{t+1}\left( x_{t+1}\right) &=& \ln \left\{ \sum\limits_{i\in \mathcal{I}}\exp \left[ Z_{i,t+1}\left( x_{t+1}\right) \right] \right\} \\ && +\left( 1-\tilde{\beta}\right) \delta \sum\limits_{i\in \mathcal{I}}\frac{\exp \left[ Z_{i,t+1}\left( x_{t+1}\right) \right]}{\sum_{j=0}^{1}\exp \left[ Z_{j,t+1}\left( x_{t+1}\right) \right]} \sum\limits_{x_{t+2}\in \mathcal{X}} V_{t+2}(x_{t+2})\,\pi (x_{t+2}|x_{t+1},i), \end{array} \qquad (15) $$

which relates \(Z_{i,t}\) to \(V_{t}\), a relationship that makes backward induction possible.
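The backward induction described in this appendix can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' estimation code: utilities are time-invariant for brevity, the state space has two points, and \(W\) is assumed to take the form \(u + \beta\delta\,\mathrm{E}V\) (its definition in (4) is not restated here). The function `solve` and all inputs are hypothetical.

```python
import math

def log_sum_exp(values):
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def logit(values):
    lse = log_sum_exp(values)
    return [math.exp(v - lse) for v in values]

def solve(u, pi, beta, beta_tilde, delta, T):
    """Backward induction over periods T, T-1, ..., 0.

    u[i][x]     : instantaneous utility of action i in state x
    pi[i][x][y] : transition probability from state x to y under action i
    Returns V[t][x], P[t][x][i] (actual) and P_tilde[t][x][i] (perceived).
    """
    n_actions, n_states = len(u), len(u[0])
    V = [[0.0] * n_states for _ in range(T + 2)]  # V[T+1] = 0: no continuation value
    P = [None] * (T + 1)
    P_tilde = [None] * (T + 1)
    for t in range(T, -1, -1):
        P[t], P_tilde[t] = [], []
        for x in range(n_states):
            # Expected next-period value under each action.
            ev = [sum(pi[i][x][y] * V[t + 1][y] for y in range(n_states))
                  for i in range(n_actions)]
            W = [u[i][x] + beta * delta * ev[i] for i in range(n_actions)]
            Z = [u[i][x] + beta_tilde * delta * ev[i] for i in range(n_actions)]
            p_tilde = logit(Z)
            # Log-sum of Z plus the (1 - beta_tilde) * delta correction term.
            V[t][x] = log_sum_exp(Z) + (1 - beta_tilde) * delta * sum(
                p_tilde[i] * ev[i] for i in range(n_actions))
            P[t].append(logit(W))
            P_tilde[t].append(p_tilde)
    return V, P, P_tilde

# Hypothetical inputs: action 0 is normalized to zero utility in every state.
u = [[0.0, 0.0], [-1.0, -0.5]]
pi = [[[0.9, 0.1], [0.4, 0.6]],   # transitions under action 0
      [[0.6, 0.4], [0.1, 0.9]]]   # transitions under action 1
T = 5
V, P, P_tilde = solve(u, pi, beta=0.6, beta_tilde=0.9, delta=0.95, T=T)
print(P[0][0][1], P_tilde[0][0][1])
```

In the terminal period the expected continuation value is zero, so \(W\), \(Z\), and \(u\) coincide and the value reduces to the log-sum of utilities. Estimation would wrap a routine like this in a likelihood over observed choices, a step omitted here.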

About this article

Cite this article

Wang, Y., Sloan, F.A. Present bias and health. J Risk Uncertain 57, 177–198 (2018).

Keywords


  • Hyperbolic discounting
  • Present bias
  • Naivete
  • Time preference
  • Diabetes
  • Adherence

JEL Classifications

  • I12
  • D90