Abstract
We investigate an algorithm for the adaptive estimation of partially observed finite-state Markov chains. The algorithm uses the recursive equation that characterizes the conditional distribution of the state of the Markov chain given the past observations. We show that the process "driving" the algorithm has a unique invariant measure for each fixed value of the parameter and, following the ordinary differential equation method for stochastic approximations, establish almost sure convergence of the parameter estimates to the solutions of an associated differential equation. Finally, we analyze the performance of the adaptive estimation scheme by examining the induced controlled Markov process under a long-run average cost criterion.
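The recursive equation mentioned in the abstract is the standard filter recursion for a finite-state chain observed through a noisy channel: predict with the transition matrix, reweight by the observation likelihood, and normalize. The following is a minimal illustrative sketch of that recursion, not the authors' adaptive algorithm itself; the matrices `A` and `B` and the function name are assumptions introduced here for illustration.

```python
import numpy as np

def filter_update(pi, A, B, y):
    """One step of the conditional-distribution recursion for a
    partially observed finite-state Markov chain (illustrative sketch).

    pi : current conditional distribution over the N states
    A  : N x N transition matrix, A[i, j] = P(x_{n+1} = j | x_n = i)
    B  : N x M observation matrix, B[j, y] = P(y_n = y | x_n = j)
    y  : index of the new observation
    """
    pred = pi @ A                  # predict: one-step state distribution
    unnorm = pred * B[:, y]        # correct: weight by observation likelihood
    return unnorm / unnorm.sum()   # normalize back to a probability vector

# Example with a two-state chain and binary observations (assumed values):
A = np.array([[0.9, 0.1],
              [0.2, 0.8]])
B = np.array([[0.8, 0.2],
              [0.3, 0.7]])
pi = np.array([0.5, 0.5])
pi = filter_update(pi, A, B, y=0)  # -> approximately [0.7652, 0.2348]
```

In the adaptive setting studied in the paper, the unknown parameter enters through matrices such as `A` and `B`, and the pair (state estimate, parameter estimate) is what makes the driving process a state-dependent-noise stochastic approximation.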
Additional information
This research was supported in part by the Air Force Office of Scientific Research under Grant AFOSR-86-0029, in part by the National Science Foundation under Grant ECS-8617860 and in part by the DoD Joint Services Electronics Program through the Air Force Office of Scientific Research (AFSC) Contract F49620-86-C-0045.
Cite this article
Arapostathis, A., Marcus, S.I. Analysis of an identification algorithm arising in the adaptive estimation of Markov chains. Math. Control Signal Systems 3, 1–29 (1990). https://doi.org/10.1007/BF02551353