Cybernetics, Machine Learning, and Stochastic Learning Automata

Springer Handbook of Automation

Part of the book series: Springer Handbooks (SHB)

Abstract

This chapter presents the area of Cybernetics and its relationship to Machine Learning (ML), Learning Automata (LA), Reinforcement Learning (RL), and Estimator Algorithms, all of which are considered topics within Artificial Intelligence. In particular, Learning Automata are probabilistic finite state machines that have been used to model how biological systems learn. The structure of such a machine can be fixed, or it can change with time. A Learning Automaton can also be implemented using action-probability (action-choosing) updating rules, which may or may not depend on estimates obtained from the Environment being investigated.
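
As an illustration of what an action-probability updating rule can look like, the sketch below implements the classical linear reward-inaction (L_RI) scheme from the Learning Automata literature. It is not taken from the chapter itself: the function names, the `rate` parameter, and the Bernoulli reward probabilities standing in for the Environment are all assumptions made for this example. L_RI uses no estimates from the Environment, so it corresponds to the non-estimator case mentioned in the abstract.

```python
import random


def lri_update(probs, chosen, rewarded, rate=0.05):
    """Return the action-probability vector after one L_RI step.

    On a reward, the chosen action's probability moves toward 1 and all
    others shrink proportionally; on a penalty, nothing changes.
    """
    if not rewarded:
        return list(probs)  # "inaction": penalties leave the vector untouched
    updated = [(1.0 - rate) * p for p in probs]
    updated[chosen] += rate  # equals p_i + rate * (1 - p_i)
    return updated


def run_automaton(reward_probs, steps=5000, rate=0.05, seed=0):
    """Simulate an automaton against a hypothetical stationary Environment.

    reward_probs[i] is the assumed probability that action i is rewarded.
    """
    rng = random.Random(seed)
    probs = [1.0 / len(reward_probs)] * len(reward_probs)
    for _ in range(steps):
        chosen = rng.choices(range(len(probs)), weights=probs)[0]
        rewarded = rng.random() < reward_probs[chosen]
        probs = lri_update(probs, chosen, rewarded, rate)
    return probs


if __name__ == "__main__":
    # Two actions; the first is rewarded far more often in this toy setup.
    print(run_automaton([0.9, 0.2]))
```

A smaller `rate` makes learning slower but reduces the chance of locking onto a suboptimal action, which is the usual trade-off for schemes of this kind.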

Author information

Corresponding author

Correspondence to B. John Oommen.

Copyright information

© 2023 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Oommen, B.J., Yazidi, A., Misra, S. (2023). Cybernetics, Machine Learning, and Stochastic Learning Automata. In: Nof, S.Y. (ed.) Springer Handbook of Automation. Springer Handbooks. Springer, Cham. https://doi.org/10.1007/978-3-030-96729-1_10