Pattern Analysis and Applications, Volume 20, Issue 3, pp 797–808

The design of absorbing Bayesian pursuit algorithms and the formal analyses of their ε-optimality

  • Xuan Zhang
  • B. John Oommen
  • Ole-Christoffer Granmo
Theoretical Advances

Abstract

The fundamental technique that has been used to enhance the convergence speed of learning automata (LA) is that of incorporating the running maximum likelihood (ML) estimates of the action reward probabilities into the probability updating rules for selecting the actions. The frontiers of this field have recently been expanded by replacing the ML estimates with their Bayesian counterparts, which incorporate the properties of the conjugate priors. These constitute the Bayesian pursuit algorithm (BPA) and the discretized Bayesian pursuit algorithm (DBPA). Although these algorithms have been designed and efficiently implemented, and are, arguably, the fastest and most accurate LA reported in the literature, the proofs of their \(\epsilon\)-optimal convergence have remained unsolved. Resolving this is precisely the intent of this paper. We present a single unifying analysis by which both the continuous and the discretized schemes are proven to be \(\epsilon\)-optimal. We emphasize that, unlike the ML-based pursuit schemes, the Bayesian schemes have to consider not only the estimates themselves but also the distributional forms of their conjugate posteriors and their higher order moments, all of which render the proofs particularly challenging. As far as we know, apart from the results themselves, the methodologies of this proof have been unreported in the literature; they are both pioneering and novel.
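To make the idea of a pursuit scheme driven by Bayesian estimates concrete, the following is a minimal sketch and not the authors' published algorithm. It assumes Bernoulli rewards, a uniform Beta(1, 1) prior on each action's reward probability, the posterior mean as the point estimate, and a continuous linear pursuit update applied at every step. The function name `bayesian_pursuit` and the parameters `learning_rate`, `steps` and `seed` are hypothetical; the published BPA may pursue a different statistic of the Beta posterior.

```python
import random


def bayesian_pursuit(reward_probs, learning_rate=0.01, steps=5000, seed=0):
    """Illustrative pursuit automaton with Beta-posterior estimates.

    Assumptions (not taken from the paper): Beta(1, 1) priors, the
    posterior mean as the estimate, and an update on every step.
    """
    rng = random.Random(seed)
    r = len(reward_probs)
    p = [1.0 / r] * r      # action selection probabilities
    alpha = [1.0] * r      # Beta parameter: 1 + number of rewards
    beta = [1.0] * r       # Beta parameter: 1 + number of penalties

    for _ in range(steps):
        # Select an action according to the current probability vector.
        i = rng.choices(range(r), weights=p)[0]

        # Simulated environment: Bernoulli reward for the chosen action.
        if rng.random() < reward_probs[i]:
            alpha[i] += 1.0
        else:
            beta[i] += 1.0

        # Bayesian estimate: mean of each action's Beta posterior.
        est = [alpha[j] / (alpha[j] + beta[j]) for j in range(r)]
        best = max(range(r), key=est.__getitem__)

        # Pursuit step: move p toward the unit vector of the best action.
        p = [(1.0 - learning_rate) * pj for pj in p]
        p[best] += learning_rate

    return p, est


if __name__ == "__main__":
    probs, estimates = bayesian_pursuit([0.9, 0.3, 0.1])
    print(probs, estimates)
```

The pursuit step keeps the probabilities summing to one while shifting mass toward the action with the currently highest estimate, so the vector drifts toward a unit vector, which is the absorbing behavior the title refers to. A smaller `learning_rate` slows convergence but gives the estimates more time to become reliable before the automaton commits.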

Keywords

  • Bayesian pursuit algorithms (BPA)
  • Discretized Bayesian pursuit algorithms (DBPA)
  • ε-optimality of LA
  • Beta distribution

Copyright information

© Springer-Verlag London 2016

Authors and Affiliations

  • Xuan Zhang (1)
  • B. John Oommen (1, 2)
  • Ole-Christoffer Granmo (1)

  1. Department of ICT, University of Agder, Grimstad, Norway
  2. School of Computer Science, Carleton University, Ottawa, Canada
