Time Consistent Discounting

  • Tor Lattimore
  • Marcus Hutter
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6925)

Abstract

A possibly immortal agent tries to maximise its summed discounted rewards over time, where discounting is used to avoid infinite utilities and encourage the agent to value current rewards more than future ones. Some commonly used discount functions lead to time-inconsistent behavior where the agent changes its plan over time. These inconsistencies can lead to very poor behavior. We generalise the usual discounted utility model to one where the discount function changes with the age of the agent. We then give a simple characterisation of time-(in)consistent discount functions and show the existence of a rational policy for an agent that knows its discount function is time-inconsistent.
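
The plan changes mentioned in the abstract can be illustrated with a small sketch. It is not the paper's formal construction: the two discount functions, the parameter values, and the two-option choice below are hypothetical, chosen only to show a preference reversal. A geometric discount ranks the two delayed rewards the same way at every time step, while a hyperbolic discount applied relative to the current time eventually reverses the ranking.

```python
def geometric(gamma):
    """Discount weight gamma**k for a reward k steps in the future."""
    return lambda k: gamma ** k


def hyperbolic(kappa):
    """Discount weight 1 / (1 + kappa * k) for a reward k steps in the future."""
    return lambda k: 1.0 / (1.0 + kappa * k)


def preferred(discount, now, options):
    """Index of the option with the largest discounted value as seen at time `now`.

    `options` is a list of (time, reward) pairs with time >= now.
    """
    values = [discount(t - now) * r for t, r in options]
    return max(range(len(options)), key=values.__getitem__)


# Hypothetical choice: reward 1.0 at time 10, or reward 1.5 at time 12.
options = [(10, 1.0), (12, 1.5)]

for name, d in [("geometric", geometric(0.9)), ("hyperbolic", hyperbolic(1.0))]:
    early = preferred(d, 0, options)   # ranking when both rewards are far away
    late = preferred(d, 10, options)   # ranking once the first reward is available
    print(name, "prefers option", early, "at t=0 and option", late, "at t=10")
```

With these assumed values the geometric agent picks the later, larger reward at both times, whereas the hyperbolic agent plans to wait at time 0 but takes the smaller, earlier reward once time 10 arrives.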

Keywords

Rational agents, sequential decision theory, general discounting, time-consistency, game theory

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Tor Lattimore (1)
  • Marcus Hutter (1, 2)
  1. Research School of Computer Science, Australian National University, Australia
  2. ETH Zürich, Switzerland
