
Analysis of an identification algorithm arising in the adaptive estimation of Markov chains

Mathematics of Control, Signals and Systems

Abstract

We investigate an algorithm applied to the adaptive estimation of partially observed finite-state Markov chains. The algorithm utilizes the recursive equation characterizing the conditional distribution of the state of the Markov chain, given the past observations. We show that the process “driving” the algorithm has a unique invariant measure for each fixed value of the parameter, and following the ordinary differential equation method for stochastic approximations, establish almost sure convergence of the parameter estimates to the solutions of an associated differential equation. The performance of the adaptive estimation scheme is analyzed by examining the induced controlled Markov process with respect to a long-run average cost criterion.
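As an illustration of the kind of recursion the abstract refers to, the following is a minimal sketch of the standard filter update for a finite-state Markov chain observed through a noisy channel. It is not taken from the paper: the function name `filter_step`, the matrices `P` and `B`, and the two-state numbers are all hypothetical, and the paper's parametrized, adaptive version of the recursion is not reproduced here.

```python
import numpy as np

def filter_step(pi, P, B, y):
    """One step of the conditional-distribution (filter) recursion.

    pi : current conditional distribution of the state, shape (n,)
    P  : transition matrix, P[i, j] = Pr(next state j | current state i)
    B  : observation matrix, B[j, y] = Pr(observation y | state j)
    y  : index of the newly received observation
    (All names are illustrative assumptions, not the paper's notation.)
    """
    # Predict one step ahead, then weight each state by the likelihood of y.
    unnormalized = (pi @ P) * B[:, y]
    total = unnormalized.sum()
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    # Normalize so the result is again a probability vector.
    return unnormalized / total

# Hypothetical two-state model and a short observation sequence.
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
B = np.array([[0.7, 0.3],
              [0.1, 0.9]])
pi = np.array([0.5, 0.5])
for y in [0, 0, 1]:
    pi = filter_step(pi, P, B, y)
```

In an adaptive scheme of the type the abstract describes, `P` and `B` would depend on the unknown parameter, and the filter would be run with the current parameter estimate in place of the true value.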



Additional information

This research was supported in part by the Air Force Office of Scientific Research under Grant AFOSR-86-0029, in part by the National Science Foundation under Grant ECS-8617860 and in part by the DoD Joint Services Electronics Program through the Air Force Office of Scientific Research (AFSC) Contract F49620-86-C-0045.


About this article

Cite this article

Arapostathis, A., Marcus, S.I. Analysis of an identification algorithm arising in the adaptive estimation of Markov chains. Math. Control Signal Systems 3, 1–29 (1990). https://doi.org/10.1007/BF02551353


  • DOI: https://doi.org/10.1007/BF02551353
