Abstract
The theoretical foundations for this class of models have been summarized in Easley and Kiefer (1988). A discrete-time decision problem is considered in which the decisionmaker chooses an action r in each period to maximize the total expected discounted reward, which depends on the action chosen and on the outcome, a random variable. The conditional distribution f(·|r, β) of the outcome given the action depends on an initially unknown parameter β. The decisionmaker begins with a prior belief about the unknown parameter and, at the end of each period, updates it via Bayes’ rule using the latest observations of the action taken and the outcome. Easley and Kiefer take the additional simplifying step of integrating out the outcome and redefining the maximand to be the total expected discounted mean reward, where the mean is calculated with respect to the conditional distribution f(·|r, β) and the belief distribution.
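The belief update and the mean reward described above can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the chapter: the belief is a discrete distribution over two candidate values of β, the outcome is drawn from a normal distribution with mean βr and unit variance, and the reward is the negative squared distance of the outcome from a hypothetical target. The function names are likewise hypothetical.

```python
import math

def normal_pdf(y, mean, sd=1.0):
    """Normal density, standing in for an assumed f(y | r, beta)."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def bayes_update(prior, r, y):
    """Posterior belief over candidate beta values after seeing outcome y under action r.

    `prior` maps each candidate beta to its probability. The outcome model
    y ~ Normal(beta * r, 1) is an illustrative assumption.
    """
    unnormalized = {b: p * normal_pdf(y, b * r) for b, p in prior.items()}
    total = sum(unnormalized.values())
    return {b: w / total for b, w in unnormalized.items()}

def mean_reward(belief, r, target=1.0):
    """Reward of action r with both the outcome and beta integrated out.

    Assumed reward: -(y - target)^2. Under y ~ Normal(beta * r, 1) its
    conditional mean is -((beta * r - target)^2 + 1), which is then
    averaged over the belief distribution.
    """
    return sum(p * -((b * r - target) ** 2 + 1.0) for b, p in belief.items())

# Prior belief: beta is either 0.5 or 2.0, each with probability one half.
prior = {0.5: 0.5, 2.0: 0.5}

# One period: take action r = 1.0, observe outcome y = 1.9, update the belief.
posterior = bayes_update(prior, r=1.0, y=1.9)
```

In this sketch the observed outcome lies much closer to the mean implied by β = 2.0, so the posterior shifts weight toward that value, and `mean_reward` can be evaluated under either belief to compare candidate actions. It covers a single period only; the discounted multi-period problem treated in the chapter is not reproduced here.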
Copyright information
© 1991 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Horvath, B. (1991). A Paradigmatic Example. In: Are Policy Variables Exogenous?. Lecture Notes in Economics and Mathematical Systems, vol 364. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-58211-0_2
DOI: https://doi.org/10.1007/978-3-642-58211-0_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-54287-2
Online ISBN: 978-3-642-58211-0
eBook Packages: Springer Book Archive