An Effective Rule Based Policy Representation and its Optimization using Inter Normal Distribution Crossover
Genetic algorithms (GAs) have the advantage that they can handle target functions regardless of their form. Recently, many studies have applied GAs to policy search. Approaches that use a rule-based policy with a Gaussian mixture are especially promising. However, no genetic operator exists that creates a new normal distribution from multiple ones. We propose an effective policy representation and a new genetic operator for it, INDX (Inter Normal Distribution Crossover). We demonstrate the performance of the proposed method on two benchmark problems, the Mountain Car task and the Cart-Pole Swing-up task.