Present bias and health


This study uses a dynamic discrete choice model to examine the degree of present bias and naivete about present bias in individuals’ health care decisions. Clinical guidelines exist for several common chronic diseases. Although the empirical evidence for some guidelines is strong, many individuals with these diseases do not follow the guidelines. Using persons with diabetes as a case study, we find evidence of substantial present bias and naivete. Counterfactual simulations indicate the importance of present bias and naivete in explaining low adherence rates to health care guidelines.


  1. Fang and Wang (2015) develop methodologies for both finite and infinite time horizons. Here, as in the empirical analysis in that paper, we assume a finite horizon with a maximum age of 100.

  2. For a formal discussion and technical details of the identification and estimation method, see Fang and Wang (2015).

  3. We know from the standard theories of discrete choice that we have to normalize the utility for the reference alternative to 0, so without loss of generality we set \(u_{0}^{\ast } \left (x,\mathbf {\varepsilon } \right ) = 0\) for all \(x\in \mathcal {X}\).

  4. Conditional Independence Assumption:

    $$\begin{array}{@{}rcl@{}} \pi (x_{t + 1},\mathbf{\varepsilon}_{t + 1}|x_{t},\varepsilon_{t},d_{t}) &=&q(\mathbf{\varepsilon}_{t + 1}|x_{t + 1})\pi (x_{t + 1}|x_{t},d_{t}) \\ q(\mathbf{\varepsilon}_{t + 1}|x_{t + 1}) &=&q(\mathbf{\varepsilon} ). \end{array} $$

    Extreme Value Distribution Assumption: \(\mathbf{\varepsilon}_{t}\) is i.i.d. Type I extreme value distributed.

  5. Formally, the Exclusion Restriction assumption says that there exist state variables \(x_{1}\in \mathcal {X}\) and \(x_{2}\in \mathcal {X}\) with \(x_{1}\neq x_{2}\), such that (1) for all \(i\in \mathcal {I}\), \(u_{i}(x_{1}) = u_{i}(x_{2})\); and (2) for some \(i\in \mathcal {I}\), \(\pi (x|x_{1},i)\neq \pi (x|x_{2},i)\).

  6. To check the robustness of our findings, we alternatively set Adherence to one if at least two, or at least four, of the five questions were answered affirmatively. The results are qualitatively similar.

  7. The total cognition summary score is generated in the RAND version of HRS. The total word recall summary variables sum the immediate and delayed word recall scores. The mental status summary sums the scores for serial 7’s, backwards counting from 20, and object, date, and President/Vice-President naming tasks. The total cognition score sums the total word recall and mental status summary scores, resulting in a range of 0-35. This score has been used in the literature as a good measure of cognitive ability.

  8. In robustness checks not shown, we set LowCog to one when the score is below 10 or 15, and the results are qualitatively similar.

  9. In the structural analysis presented below, but not in the reduced-form analysis, we discretize LogIncome to estimate state transitions, utility, and time preference; we set the person’s instantaneous utility when dead to zero.

  10., accessed 10/01/17.

  11. There might be some unrelated future costs. For example, increased life expectancy resulting from increased adherence could lead to an increase in the incidence of other diseases (e.g., Alzheimer’s disease). An increase in adherence rates therefore might not be cost-saving when considered from a societal perspective.


  • Abdellaoui, M., Attema, A.E., Bleichrodt, H. (2010). Intertemporal tradeoffs for gains and losses: An experimental measurement of discounted utility. The Economic Journal, 120(545), 845–866.

  • Akin, Z. (2012). Intertemporal decision making with present biased preferences. Journal of Economic Psychology, 33(1), 30–47.

  • Ali, M.K., Bullard, K.M., Saaddine, J.B., Cowie, C.C., Imperatore, G., Gregg, E.W. (2013). Achievement of goals in US diabetes care, 1999–2010. New England Journal of Medicine, 368(17), 1613–1624.

  • American Diabetes Association. (2013). Economic costs of diabetes in the US in 2012. Diabetes Care, 36(4), 1033–1046.

  • Andreoni, J., Kuhn, M.A., Sprenger, C. (2015). Measuring time preferences: A comparison of experimental methods. Journal of Economic Behavior & Organization, 116, 451–464.

  • Andreoni, J., & Sprenger, C. (2012). Estimating time preferences from convex budgets. American Economic Review, 102(7), 3333–56.

  • Arcidiacono, P., Sieg, H., Sloan, F. (2007). Living rationally under the volcano? An empirical analysis of heavy drinking and smoking. International Economic Review, 48(1), 37–65.

  • Ariely, D., & Wertenbroch, K.X. (2002). Procrastination, deadlines, and performance: Self-control by precommitment. Psychological Science, 13(3), 219–224.

  • Attema, A.E., Bleichrodt, H., L’Haridon, O., Peretti-Watel, P., Seror, V. (2018). Discounting health and money: New evidence using a more robust method. Journal of Risk and Uncertainty, 56(2), 117–140.

  • Augenblick, N., Niederle, M., Sprenger, C. (2015). Working over time: Dynamic inconsistency in real effort tasks. The Quarterly Journal of Economics, 130(3), 1067–1115.

  • Baicker, K., Mullainathan, S., Schwartzstein, J. (2015). Behavioral hazard in health insurance. The Quarterly Journal of Economics, 130(4), 1623–1667.

  • Bickel, W.K., Odum, A.L., Madden, G.J. (1999). Impulsivity and cigarette smoking: Delay discounting in current, never, and ex-smokers. Psychopharmacology, 146(4), 447–454.

  • Bleichrodt, H., Gao, Y., Rohde, K.I. (2016). A measurement of decreasing impatience for health and money. Journal of Risk and Uncertainty, 52(3), 213–231.

  • Boyle, J.P., Honeycutt, A.A., Narayan, K.V., Hoerger, T.J., Geiss, L.S., Chen, H., Thompson, T.J. (2001). Projection of diabetes burden through 2050: Impact of changing demography and disease prevalence in the US. Diabetes Care, 24(11), 1936–1940.

  • Bradford, D., Courtemanche, C., Heutel, G., McAlvanah, P., Ruhm, C. (2017). Time preferences and consumer behavior. Journal of Risk and Uncertainty, 55(2-3), 119–145.

  • Cavagnaro, D.R., Aranovich, G.J., McClure, S.M., Pitt, M.A., Myung, J.I. (2016). On the functional form of temporal discounting: an optimized adaptive test. Journal of Risk and Uncertainty, 52(3), 233–254.

  • Chapman, G.B. (1996). Temporal discounting and utility for health and money. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22(3), 771.

  • Chen, Y., Sloan, F.A., Yashkin, A.P. (2015). Adherence to diabetes guidelines for screening, physical activity and medication and onset of complications and death. Journal of Diabetes and its Complications, 29(8), 1228–1233.

  • Chung, D.J., Steenburgh, T., Sudhir, K. (2013). Do bonuses enhance sales productivity? A dynamic structural analysis of bonus-based compensation plans. Marketing Science, 33(2), 165–187.

  • Courtemanche, C., Heutel, G., McAlvanah, P. (2015). Impatience, incentives and obesity. The Economic Journal, 125(582), 1–31.

  • Cowie, C.C., Rust, K.F., Byrd-Holt, D.D., Eberhardt, M.S., Flegal, K.M., Engelgau, M.M., Saydah, S.H., Williams, D.E., Geiss, L.S., Gregg, E.W. (2006). Prevalence of diabetes and impaired fasting glucose in adults in the US population: National Health And Nutrition Examination Survey 1999–2002. Diabetes Care, 29(6), 1263–1268.

  • DellaVigna, S., & Malmendier, U. (2006). Paying not to go to the gym. American Economic Review, 96(3), 694–719.

  • DiMatteo, M.R. (2004). Variations in patients’ adherence to medical recommendations: A quantitative review of 50 years of research. Medical Care, 42(3), 200–209.

  • Fang, H., & Wang, Y. (2015). Estimating dynamic discrete choice models with hyperbolic discounting, with an application to mammography decisions. International Economic Review, 56(2), 565–596.

  • Ferecatu, A., & Önçüler, A. (2016). Heterogeneous risk and time preferences. Journal of Risk and Uncertainty, 53(1), 1–28.

  • Gruber, J., & Köszegi, B. (2001). Is addiction “rational”? Theory and evidence. The Quarterly Journal of Economics, 116(4), 1261–1303.

  • Hinvest, N.S., & Anderson, I.M. (2010). The effects of real versus hypothetical reward on delay and probability discounting. Quarterly Journal of Experimental Psychology, 63(6), 1072–1084.

  • Ho, P.M., Rumsfeld, J.S., Masoudi, F.A., McClure, D.L., Plomondon, M.E., Steiner, J.F., Magid, D.J. (2006). Effect of medication nonadherence on hospitalization and mortality among patients with diabetes mellitus. Archives of Internal Medicine, 166(17), 1836–1841.

  • Ikeda, S., Kang, M.I., Ohtake, F. (2010). Hyperbolic discounting, the sign effect, and the body mass index. Journal of Health Economics, 29(2), 268–284.

  • Ioannou, C.A., & Sadeh, J. (2016). Time preferences and risk aversion: Tests on domain differences. Journal of Risk and Uncertainty, 53(1), 29–54.

  • Kan, K. (2007). Cigarette smoking and self-control. Journal of Health Economics, 26(1), 61–81.

  • Madrian, B.C., & Shea, D.F. (2001). The power of suggestion: Inertia in 401(k) participation and savings behavior. The Quarterly Journal of Economics, 116(4), 1149–1187.

  • Magnac, T., & Thesmar, D. (2002). Identifying dynamic discrete decision processes. Econometrica, 70(2), 801–816.

  • Mokdad, A.H., Marks, J.S., Stroup, D.F., Gerberding, J.L. (2004). Actual causes of death in the United States, 2000. Journal of the American Medical Association, 291(10), 1238–1245.

  • Norris, S.L., Engelgau, M.M., Narayan, K.V. (2001). Effectiveness of self-management training in type 2 diabetes: A systematic review of randomized controlled trials. Diabetes Care, 24(3), 561–587.

  • O’Donoghue, T., & Rabin, M. (1999). Doing it now or later. American Economic Review, 89(1), 103–124.

  • Rust, J. (1994). Structural estimation of Markov decision processes. Handbook of Econometrics, 4, 3081–3143.

  • Sloan, F.A., Bethel, M.A., Ruiz, D., Shea, A.H., Feinglos, M.N. (2008). The growing burden of diabetes mellitus in the US elderly population. Archives of Internal Medicine, 168(2), 192–199.

  • Sloan, F.A., Padrón, N.A., Platt, A.C. (2009). Preferences, beliefs, and self-management of diabetes. Health Services Research, 44(3), 1068–1087.

  • Sloan, F.A., Eldred, L.M., Xu, Y. (2014). The behavioral economics of drunk driving. Journal of Health Economics, 35, 64–81.

  • US Department of Health and Human Services. (1994). Preventing tobacco use among young people: A report of the Surgeon General. US Department of Health and Human Services.

  • Van der Pol, M., & Cairns, J. (2011). Descriptive validity of alternative intertemporal models for health outcomes: An axiomatic test. Health Economics, 20(7), 770–782.

  • Yashkin, A.P., Hahn, P., Sloan, F.A. (2016). Introducing anti-vascular endothelial growth factor therapies for AMD did not raise risk of myocardial infarction, stroke, and death. Ophthalmology, 123(10), 2225–2231.

  • Zhuo, X., Zhang, P., Hoerger, T.J. (2013). Lifetime direct medical costs of treating type 2 diabetes and diabetic complications. American Journal of Preventive Medicine, 45(3), 253–261.

Author information


Corresponding author

Correspondence to Yang Wang.

Additional information

Partial support for this research came from a grant from the National Institute on Aging to Duke University (NIA grant R01-AG017473).



This appendix follows Section 2 and provides more details on the identification and estimation of the dynamic discrete choice model. Specifically, given the Extreme Value Distribution assumption, the probability that action \(i\) is chosen in state \(x_{t}\), \(P_{i,t}(x_{t})\), is:

$$ P_{i,t}(x_{t}) =\Pr \left[W_{i,t}\left( x_{t}\right) +\varepsilon_{i,t}\geq W_{j,t}\left( x_{t}\right) +\varepsilon_{j,t}, \forall j\neq i \right] =\frac{ \exp \left[W_{i,t}\left( x_{t}\right) \right]} {\sum_{j = 0}^{1}\exp \left[ W_{j,t}\left( x_{t}\right) \right]}. \tag{9} $$

While \(W\), defined in (4), is not observable, the actual choice probabilities \(P_{i,t}(x_{t})\) are observable in the data and can be used to infer \(W\).
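As a minimal illustrative sketch (not the authors' code), the logit formula in (9) and its inversion can be written in Python. The function names `choice_probabilities` and `value_difference` and the numerical values are hypothetical. In the binary case, observed choice probabilities pin down only the difference \(W_{1,t}(x_{t}) - W_{0,t}(x_{t})\), through the log odds.

```python
import math

def choice_probabilities(w):
    """Logit probabilities exp(W_i) / sum_j exp(W_j) for a list of values W."""
    m = max(w)  # subtracting the maximum avoids overflow; the ratio is unchanged
    weights = [math.exp(v - m) for v in w]
    total = sum(weights)
    return [v / total for v in weights]

def value_difference(p1):
    """Invert the binary logit: W_1 - W_0 = ln(p1 / (1 - p1))."""
    return math.log(p1 / (1.0 - p1))

# Hypothetical values W_0 = 0 and W_1 = 0.5 for one state and period.
probs = choice_probabilities([0.0, 0.5])
recovered = value_difference(probs[1])  # equals W_1 - W_0 up to rounding
```

The inversion recovers only a difference in values, which is one reason the utility of the reference alternative has to be normalized.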

With the choice-specific value function of the next-period self as perceived by the current self, \(Z_{i,t+1}(x_{t+1})\), defined in (5), the current self’s perception of her future self’s choice, \(\sigma\), can be defined as

$$\begin{array}{@{}rcl@{}} \sigma\left( x_{t+1},\mathbf{\varepsilon}_{t+1}\right) &=&\arg\max_{i\in \mathcal{I}}\left[ u_{i,t+1}\left( x_{t+1}\right) +\varepsilon_{i,t+1}+\tilde{\beta}\delta \sum\limits_{x_{t+2}\in \mathcal{X}}V_{t+2}(x_{t+2})\pi (x_{t+2}|x_{t+1},i)\right] \\ &=&\arg\max_{i\in \mathcal{I}}\left[ Z_{i,t+1}\left( x_{t+1}\right) +\varepsilon_{i,t+1}\right] . \end{array} $$

Then, again under the Extreme Value Distribution assumption, the probability, as perceived by the current-period self, that the next-period self chooses alternative \(i\) when the next period’s state is \(x_{t+1}\), denoted \(\tilde {P}_{i,t+1}\left (x_{t+1}\right )\), is:

$$\begin{array}{@{}rcl@{}} \tilde{P}_{i,t+1}\left( x_{t+1}\right) &=&\Pr \left[ \sigma\left( x_{t+1},\mathbf{\varepsilon}_{t+1}\right) =i\right] \\ &=&\Pr \left[Z_{i,t+1}\left( x_{t+1}\right) +\varepsilon_{i,t+1}\geq Z_{j,t+1}\left( x_{t+1}\right) +\varepsilon_{j,t+1},\forall j\neq i\right] \\ &=&\frac{\exp \left[ Z_{i,t+1}\left( x_{t+1}\right) \right]} {\sum_{j = 0}^{1}\exp \left[ Z_{j,t+1}\left( x_{t+1}\right) \right]} . \end{array} \tag{10} $$

The distinction between \(\tilde {P}\) and \(P\) is that the sophisticated present-biased decision-maker knows the extent of her actual future present bias; by contrast, the naive person underestimates it. That is, she thinks her future \(\beta\) is larger than it actually will be (\(\tilde {\beta} > \beta\)). For sophisticated persons, \(\tilde {P}= P\); for naive ones, \(\tilde {P} \neq P\).

With non-stationarity and a finite horizon, at \(t = T\), when the continuation value is zero, there is no distinction among \(W\), \(Z\), \(V\), and \(u\):

$$W_{i, T}=Z_{i, T}=V_{i, T}=u_{i, T}, $$

which, according to the Extreme Value Distribution assumption, leads to:

$$ V_{T}=\ln \sum\limits_{i\in \mathcal{I}}\exp [Z_{i,T}]=\ln \sum\limits_{i\in \mathcal{I}}\exp [u_{i,T}]. \tag{11} $$

Combining (11) and (5) yields \(Z_{i,T-1}\). Given the link between \(Z_{i,T-1}\) and \(V_{T-1}\), backward induction determines \(V_{t+1}\), which in turn relates to \(W_{i,t}\) in (4), and then to \(P_{i,t}(x_{t})\) in (9). By this reasoning, we link instantaneous utility \(u\) to \(P\), which is observable in the data. Once this relationship between \(u\) and \(P\) is established empirically from the choice probabilities \(P_{i,t}(x_{t})\) and transition probabilities \(\pi (x_{t+1}|x_{t},i)\) for all \(x\in \mathcal {X}\) and for \(i\in \{0,1\}\), we can estimate the utility parameters for a given \(\left \langle \beta ,\tilde {\beta },\delta \right \rangle \).

The relationship between \(Z_{i,t}\) and \(V_{t}\) can be described in three steps. First, combining (5) and (6) yields

$$ V_{i,t+1}\left( x_{t+1}\right) =Z_{i,t+1}\left( x_{t+1}\right) +\left( 1- \tilde{\beta}\right) \delta \sum\limits_{x_{t+2}\in \mathcal{X}}V_{t+2}(x_{t+2})\pi (x_{t+2}|x_{t+1},i). \tag{12} $$
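The step from (5) and (6) to (12) can be spelled out as follows. This is a sketch under an assumption: (6) is not reproduced in this appendix, and it is taken here to define \(V_{i,t+1}\) as instantaneous utility plus the \(\delta\)-discounted expected continuation value. The expression for \(Z_{i,t+1}\) is the one that appears inside the definition of \(\sigma\) above. Subtracting the first line from the second cancels \(u_{i,t+1}\) and leaves the \(\left( 1-\tilde{\beta}\right) \delta\) term:

$$\begin{array}{@{}rcl@{}} Z_{i,t+1}\left( x_{t+1}\right) &=&u_{i,t+1}\left( x_{t+1}\right) +\tilde{\beta}\delta \sum\limits_{x_{t+2}\in \mathcal{X}}V_{t+2}(x_{t+2})\pi (x_{t+2}|x_{t+1},i), \\ V_{i,t+1}\left( x_{t+1}\right) &=&u_{i,t+1}\left( x_{t+1}\right) +\delta \sum\limits_{x_{t+2}\in \mathcal{X}}V_{t+2}(x_{t+2})\pi (x_{t+2}|x_{t+1},i), \\ V_{i,t+1}\left( x_{t+1}\right) -Z_{i,t+1}\left( x_{t+1}\right) &=&\left( 1-\tilde{\beta}\right) \delta \sum\limits_{x_{t+2}\in \mathcal{X}}V_{t+2}(x_{t+2})\pi (x_{t+2}|x_{t+1},i). \end{array} $$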

Given (12), (7) can be rewritten as:

$$\begin{array}{@{}rcl@{}} V_{t+1}\left( x_{t+1}\right) &=&\mathrm{E}_{\mathbf{\varepsilon}_{t+1}}\left[ V_{\sigma\left( x_{t+1},\mathbf{\varepsilon}_{t+1}\right),t+1}\left( x_{t+1}\right) +\varepsilon_{\sigma\left( x_{t+1},\mathbf{\varepsilon}_{t+1}\right),t+1}\right] \\ &=&\mathrm{E}_{\mathbf{\varepsilon}_{t+1}}\left[ \begin{array}{c} Z_{\sigma\left( x_{t+1},\mathbf{\varepsilon}_{t+1}\right),t+1}\left( x_{t+1}\right) +\varepsilon_{\sigma\left( x_{t+1},\mathbf{\varepsilon}_{t+1}\right),t+1} \\ +\left( 1-\tilde{\beta}\right) \delta \sum_{x_{t+2}\in \mathcal{X}}V_{t+2}(x_{t+2})\pi (x_{t+2}|x_{t+1},\sigma\left( x_{t+1},\mathbf{\varepsilon}_{t+1}\right) ) \end{array} \right] \\ &=&\mathrm{E}_{\mathbf{\varepsilon}_{t+1}}\max_{i\in \mathcal{I}}\left[ Z_{i,t+1}\left( x_{t+1}\right) +\varepsilon_{i,t+1}\right] \\ &&+\left( 1-\tilde{\beta}\right) \delta \mathrm{E}_{\mathbf{\varepsilon}_{t+1}}\sum\limits_{x_{t+2}\in \mathcal{X}}V_{t+2}(x_{t+2})\pi (x_{t+2}|x_{t+1},\sigma\left( x_{t+1},\mathbf{\varepsilon}_{t+1}\right) ) \\ &=&\mathrm{E}_{\mathbf{\varepsilon}_{t+1}}\max_{i\in \mathcal{I}}\left[ Z_{i,t+1}\left( x_{t+1}\right) +\varepsilon_{i,t+1}\right] \\ &&+\left( 1-\tilde{\beta}\right) \delta \sum\limits_{i\in \mathcal{I}}\tilde{P}_{i,t+1}\left( x_{t+1}\right) \sum\limits_{x_{t+2}\in \mathcal{X}}V_{t+2}(x_{t+2})\pi (x_{t+2}|x_{t+1},i). \end{array} \tag{13} $$

Given the Extreme Value Distribution assumption,

$$ \mathrm{E}_{\mathbf{\varepsilon}_{t+1}}\max_{i\in \mathcal{I}}\{Z_{i,t+1}(x_{t+1})+\varepsilon_{i,t+1}\}=\ln \left\{ \sum\limits_{i\in \mathcal{I}}\exp \left[ Z_{i,t+1}\left( x_{t+1}\right) \right] \right\}. \tag{14} $$
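The closed form for the expected maximum can be checked by simulation. The sketch below is illustrative only; the values in `z`, the seed, and the helper names are hypothetical. It relies on a standard property of the Type I extreme value distribution: the log-sum formula holds exactly when the shocks are normalized to have mean zero (location equal to minus Euler's constant), while with location zero the expected maximum exceeds the log-sum by Euler's constant. The constant is the same for every alternative, so it does not affect choice probabilities.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329

def logsumexp(values):
    """ln(sum_i exp(values[i])), computed stably."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def draw_type1_ev(rng, location=0.0):
    """One Type I extreme value (Gumbel) draw by inverse-CDF sampling."""
    u = rng.random()
    while u == 0.0:  # guard against log(0)
        u = rng.random()
    return location - math.log(-math.log(u))

def simulated_emax(z, n_draws, rng, location=0.0):
    """Monte Carlo estimate of E[max_i (z_i + eps_i)] with i.i.d. shocks."""
    total = 0.0
    for _ in range(n_draws):
        total += max(v + draw_type1_ev(rng, location) for v in z)
    return total / n_draws

rng = random.Random(12345)
z = [0.0, 0.8]  # hypothetical Z_0 and Z_1 for one state
closed_form = logsumexp(z)
emax_mean_zero = simulated_emax(z, 100_000, rng, location=-EULER_GAMMA)
emax_standard = simulated_emax(z, 100_000, rng, location=0.0)
```

Up to simulation error, `emax_mean_zero` should match `closed_form`, and `emax_standard` should match `closed_form + EULER_GAMMA`.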

Combined with (14) and (10), (13) can be rewritten as

$$\begin{array}{@{}rcl@{}} V_{t+1}\left( x_{t+1}\right) &=&\ln \left\{ \sum\limits_{i\in \mathcal{I}}\exp \left[ Z_{i,t+1}\left( x_{t+1}\right) \right] \right\} \\ &&+\left( 1-\tilde{\beta}\right) \delta \sum\limits_{i\in \mathcal{I}}\frac{\exp \left[Z_{i,t+1}\left( x_{t+1}\right) \right]} {\sum_{j = 0}^{1}\exp \left[ Z_{j,t+1}\left( x_{t+1}\right) \right]} \sum\limits_{x_{t+2}\in \mathcal{X}}V_{t+2}(x_{t+2})\pi (x_{t+2}|x_{t+1},i), \end{array} \tag{15} $$

which relates \(Z_{i,t}\) to \(V_{t}\), a relationship that makes backward induction possible.
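The recursion described in this appendix can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' estimation code. The function `solve_backward`, the data layout, and all numbers are hypothetical. Definition (4) is not reproduced here, so \(W\) is assumed to take the usual form with the true \(\beta\) in place of \(\tilde{\beta}\). The entry `P_tilde[t]` holds the choice probabilities of the period-\(t\) self as perceived by earlier selves, and `P[t]` holds the actual ones.

```python
import math

def logsumexp(values):
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def softmax(values):
    m = max(values)
    weights = [math.exp(v - m) for v in values]
    total = sum(weights)
    return [w / total for w in weights]

def solve_backward(u, pi, beta, beta_tilde, delta):
    """Backward induction over periods 0..T.

    u[t][x][i]  : instantaneous utility of action i in state x at period t
    pi[i][x][y] : probability of moving from state x to state y under action i
    Returns (V, P, P_tilde), indexed as V[t][x], P[t][x][i], P_tilde[t][x][i].
    """
    T = len(u) - 1
    states = range(len(u[0]))
    actions = range(len(u[0][0]))
    V = [None] * (T + 1)
    P = [None] * (T + 1)
    P_tilde = [None] * (T + 1)
    # Terminal period: no continuation value, so W = Z = u and (11) applies.
    V[T] = [logsumexp(u[T][x]) for x in states]
    P[T] = [softmax(u[T][x]) for x in states]
    P_tilde[T] = [softmax(u[T][x]) for x in states]
    for t in range(T - 1, -1, -1):
        V[t], P[t], P_tilde[t] = [], [], []
        for x in states:
            # Expected next-period value for each action.
            ev = [sum(pi[i][x][y] * V[t + 1][y] for y in states) for i in actions]
            z = [u[t][x][i] + beta_tilde * delta * ev[i] for i in actions]
            # ASSUMPTION: W uses the true beta, as in the usual beta-delta setup.
            w = [u[t][x][i] + beta * delta * ev[i] for i in actions]
            p_tilde = softmax(z)
            # Equation (15): log-sum term plus the (1 - beta_tilde) * delta term.
            correction = sum(p_tilde[i] * ev[i] for i in actions)
            V[t].append(logsumexp(z) + (1.0 - beta_tilde) * delta * correction)
            P[t].append(softmax(w))
            P_tilde[t].append(p_tilde)
    return V, P, P_tilde

# Hypothetical two-state, two-action, three-period example (numbers made up).
u_period = [[0.0, -1.0], [0.0, -0.2]]  # utility of action 0 normalized to zero
u = [u_period, u_period, u_period]
pi = [
    [[0.9, 0.1], [0.5, 0.5]],  # transitions under action 0
    [[0.6, 0.4], [0.2, 0.8]],  # transitions under action 1
]
V_naive, P_naive, Pt_naive = solve_backward(u, pi, beta=0.6, beta_tilde=0.9, delta=0.95)
V_soph, P_soph, Pt_soph = solve_backward(u, pi, beta=0.6, beta_tilde=0.6, delta=0.95)
```

In the sophisticated run, where \(\tilde{\beta}\) equals \(\beta\), perceived and actual probabilities coincide in every period. In the naive run they coincide only in the terminal period, where there is no continuation value.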

About this article

Cite this article

Wang, Y., Sloan, F.A. Present bias and health. J Risk Uncertain 57, 177–198 (2018).
