The Kumar-Becker-Lin scheme revisited

Borkar, V. S.

doi:10.1007/BF00939540

Contributed Papers
Published: August 1990

Journal of Optimization Theory and Applications, Volume 66, pages 289–309 (1990)

Abstract

The Kumar-Becker-Lin scheme introduces a slowly vanishing cost bias in the parameter estimation part of self-tuning control in order to improve its performance. This paper establishes the a.s. optimality of a variant of this scheme for Markov chains on a countable state space when the action space is compact metric and the parameter space is a compact subset of R^m.
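
As a rough illustration of what a slowly vanishing cost bias in the estimation step can look like, the sketch below picks a parameter estimate by maximizing log-likelihood minus a penalty proportional to the optimal cost under each candidate. It is not the variant analyzed in the paper: the finite candidate set, the square-root penalty weight, and the names `cost_biased_estimate`, `log_likelihood`, `optimal_cost`, and `bias_weight` are all assumptions made for the example. The weight grows more slowly than t, so the bias per observation tends to zero.

```python
import math
from typing import Callable, Dict, Hashable


def cost_biased_estimate(
    log_likelihood: Dict[Hashable, float],
    optimal_cost: Dict[Hashable, float],
    t: int,
    bias_weight: Callable[[float], float] = math.sqrt,
) -> Hashable:
    """Return the candidate maximizing log-likelihood minus a cost penalty.

    bias_weight(t) is assumed to grow without bound but more slowly than t,
    so the penalty per observation vanishes as t increases.
    """
    weight = bias_weight(t)
    return max(
        log_likelihood,
        key=lambda theta: log_likelihood[theta] - weight * optimal_cost[theta],
    )


# Hypothetical numbers: "a" fits the data slightly better,
# while "b" has the lower optimal cost.
loglik = {"a": -50.0, "b": -51.0}
cost = {"a": 2.0, "b": 1.0}

biased = cost_biased_estimate(loglik, cost, t=100)
unbiased = cost_biased_estimate(loglik, cost, t=100, bias_weight=lambda _: 0.0)
```

With these numbers the biased rule prefers the lower-cost candidate when the likelihoods are close, while plain maximum likelihood (zero bias) keeps the better-fitting one. A large likelihood gap still overrides the bias.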

Author information

V. S. Borkar (Fellow), Tata Institute of Fundamental Research, Bangalore, India
Present address: Department of Electrical Engineering, Indian Institute of Science, Bangalore, India

Additional information

Communicated by P. Varaiya

About this article

Cite this article

Borkar, V.S. The Kumar-Becker-Lin scheme revisited. J Optim Theory Appl 66, 289–309 (1990). https://doi.org/10.1007/BF00939540
