On Using the Theory of Regular Functions to Prove the ε-Optimality of the Continuous Pursuit Learning Automaton

  • Xuan Zhang
  • Ole-Christoffer Granmo
  • B. John Oommen
  • Lei Jiao
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7906)

Abstract

There are various families of Learning Automata (LA), such as Fixed Structure, Variable Structure, Discretized, etc. Informally, if the environment is stationary, their ε-optimality is defined as their ability to converge to the optimal action with an arbitrarily large probability, provided that the learning parameter is sufficiently small or large. Of these LA families, Estimator Algorithms (EAs) are certainly the fastest, and within this family, the Pursuit algorithms have been considered to be the pioneering schemes. The existing proofs of the ε-optimality of all the reported EAs follow the same fundamental principles. Recently, it has been reported that these proofs share a common flaw. In other words, people have worked with this flawed reasoning for almost three decades. The flaw lies in the condition which apparently supports the so-called “monotonicity” property of the probability of selecting the optimal action, as explained in the paper. In this paper, we provide a new method to prove the ε-optimality of the Continuous Pursuit Algorithm (CPA), which was the pioneering EA. The new proof follows the same outline as the previous proofs, but instead of examining the monotonicity property of the action probabilities, it examines their submartingale property and then, unlike the traditional approach, invokes the theory of Regular functions to prove the ε-optimality. We believe that the proof is both unique and pioneering, and that it can form the basis for formally demonstrating the ε-optimality of other EAs.
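The abstract does not state the CPA update rule, so the following is only a minimal simulation sketch under the commonly described form of the scheme: the automaton keeps a running reward estimate for each action and, at every step, moves its action probability vector a fraction `lam` toward the unit vector of the action with the highest current estimate. The function name `run_cpa`, the parameters `lam`, `steps`, `init_samples`, and `seed`, and the Bernoulli reward environment are all assumptions made for illustration and are not taken from the paper.

```python
import random


def run_cpa(reward_probs, lam=0.01, steps=5000, init_samples=10, seed=0):
    """Simulate a pursuit-style automaton in a stationary Bernoulli environment.

    reward_probs: assumed true reward probability of each action.
    lam: learning parameter (smaller values mean slower, more reliable learning).
    Returns the final action probability vector.
    """
    rng = random.Random(seed)
    r = len(reward_probs)
    p = [1.0 / r] * r          # start from the uniform action probability vector
    rewards = [0] * r          # number of rewards received per action
    pulls = [0] * r            # number of times each action was chosen

    # Assumed initialization: sample every action a few times so that
    # each reward estimate is defined before learning starts.
    for i in range(r):
        for _ in range(init_samples):
            rewards[i] += rng.random() < reward_probs[i]
            pulls[i] += 1

    for _ in range(steps):
        # Choose an action according to the current probability vector.
        a = rng.choices(range(r), weights=p)[0]
        rewards[a] += rng.random() < reward_probs[a]
        pulls[a] += 1

        # Pursue the action whose reward estimate is currently the largest.
        estimates = [rewards[i] / pulls[i] for i in range(r)]
        best = max(range(r), key=estimates.__getitem__)
        p = [(1.0 - lam) * pj for pj in p]
        p[best] += lam

    return p


if __name__ == "__main__":
    final_p = run_cpa([0.9, 0.2], lam=0.01)
    print(final_p)
```

In this sketch, ε-optimality corresponds informally to the observation that, for a small enough `lam`, the probability of the action with the highest reward probability ends up arbitrarily close to 1 in most runs. The simulation only illustrates that behaviour; it says nothing about the submartingale argument used in the proof.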

Keywords

Pursuit Algorithms, Continuous Pursuit Algorithm, ε-optimality

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Xuan Zhang (1)
  • Ole-Christoffer Granmo (1)
  • B. John Oommen (2, 1)
  • Lei Jiao (1)

  1. Dept. of ICT, University of Agder, Grimstad, Norway
  2. School of Computer Science, Carleton University, Ottawa, Canada
