Chapter 11 suggested methods for improving the rate of learning of the BOXES algorithm. This chapter expands upon the suggested structural alteration to the methodology, namely the use of advisor cells, which are defined as cells positioned close to an active cell in the state space. If the cumulative value of their advice outweighs the active cell’s own natural strength, an advisor-enhanced decision is allowed to prevail. If the advice is weak, the natural value is asserted. In either case, during the post-processing phase, the advisor cells are statistically rewarded or penalized according to the advice they contributed. This chapter discusses two advising schemes and an improvement, applicable to either one, that delays the advising process until the system has gained some maturity. To illustrate the method, results from simulations of an inverted pendulum are included; they show the profound impact that advisor logic has on the learning performance of the familiar learning algorithm.
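The decision rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the chapter's implementation: the signed-strength encoding (sign as control action, magnitude as confidence), the choice of immediately adjacent cells as advisors, the plain sum used as "cumulative advice", and the names `neighbours` and `decide` are all hypothetical.

```python
from itertools import product


def neighbours(cell, shape):
    """Yield indices of cells adjacent to `cell` (assumption: advisors are
    the cells within one step along every state-space axis)."""
    for offset in product((-1, 0, 1), repeat=len(cell)):
        if any(offset):  # skip the all-zero offset, i.e. the cell itself
            idx = tuple(c + o for c, o in zip(cell, offset))
            if all(0 <= i < n for i, n in zip(idx, shape)):
                yield idx


def decide(strengths, cell, shape):
    """Return (decision, advised) for the active cell.

    `strengths` maps a cell index to a signed strength (assumed encoding):
    the sign is the preferred control action (+1 or -1) and the magnitude
    is the confidence in that action.
    """
    own = strengths[cell]
    advice = sum(strengths[n] for n in neighbours(cell, shape))
    if abs(advice) > abs(own):
        # Cumulative advice outweighs the natural strength: advisors prevail.
        return (1 if advice > 0 else -1), True
    # Advice is weak: the cell's natural value is asserted.
    return (1 if own >= 0 else -1), False


# Hypothetical 3 x 3 state space with a weakly confident active cell.
shape = (3, 3)
strengths = {idx: 0.0 for idx in product(range(3), repeat=2)}
strengths[(1, 1)] = 0.2   # active cell weakly prefers +1
strengths[(0, 1)] = -0.5  # two advisors prefer -1
strengths[(1, 2)] = -0.1
decision, advised = decide(strengths, (1, 1), shape)
```

The returned `advised` flag records whether the advisors overrode the cell, which is the information a post-processing phase would need in order to reward or penalize them; that bookkeeping is not shown here.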