Adaptive policies for time-varying stochastic systems under discounted criterion
We consider a class of time-varying stochastic control systems with Borel state and action spaces and possibly unbounded costs. The processes evolve according to a discrete-time equation $x_{n+1} = G_n(x_n, a_n, \xi_n)$, $n = 0, 1, \ldots$, where the $\xi_n$ are i.i.d. $\mathbb{R}^k$-valued random vectors whose common density is unknown, and the $G_n$ are given functions converging, in a restricted way, to some function $G_\infty$ as $n \to \infty$. Assuming observability of $\xi_n$, we construct an adaptive policy which is asymptotically discounted cost optimal for the limiting control system $x_{n+1} = G_\infty(x_n, a_n, \xi_n)$.
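The setting can be illustrated with a minimal simulation sketch. It is not the construction used in the paper: the scalar dynamics `G_inf` and `G`, the feedback rule, the Gaussian noise, and the histogram estimator are all hypothetical choices made for illustration only. The sketch shows the two ingredients the abstract names: dynamics $G_n$ that approach a limit $G_\infty$, and disturbances $\xi_n$ that are observed and can therefore be used to estimate their unknown density.

```python
import random


def G_inf(x, a, xi):
    # Hypothetical limiting dynamics (an assumption, not from the paper).
    return 0.5 * x + a + xi


def G(n, x, a, xi):
    # Hypothetical time-varying dynamics; the extra term vanishes as n grows,
    # so G(n, ...) converges to G_inf.
    return G_inf(x, a, xi) + x / (n + 1) ** 2


def simulate(steps, x0=1.0, seed=0):
    rng = random.Random(seed)
    x = x0
    observed = []  # disturbances xi_n, assumed observable
    gap = []       # |G_n - G_inf| along the trajectory
    for n in range(steps):
        a = -0.25 * x             # placeholder feedback rule, not an optimal policy
        xi = rng.gauss(0.0, 1.0)  # the controller treats this density as unknown
        observed.append(xi)
        nxt = G(n, x, a, xi)
        gap.append(abs(nxt - G_inf(x, a, xi)))
        x = nxt
    return x, observed, gap


def histogram_density(samples, t, width):
    # Simple histogram estimate of the density at t: fraction of samples
    # in [t - width/2, t + width/2), divided by the bin width.
    lo, hi = t - width / 2, t + width / 2
    count = sum(1 for s in samples if lo <= s < hi)
    return count / (len(samples) * width)


if __name__ == "__main__":
    x, observed, gap = simulate(1000)
    print("final gap between G_n and G_inf:", gap[-1])
    print("estimated density at 0:", histogram_density(observed, 0.0, 0.5))
```

In an adaptive scheme of the kind the abstract describes, the density estimate would be refreshed as new disturbances arrive and fed into the choice of action; here the action rule is fixed to keep the sketch short.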