Abstract
Empirical tests indicate that at least one class of genetic algorithms performs well on neural network weight optimization, both in learning rate and in scalability. The successful application of these genetic algorithms to supervised learning problems sets the stage for their use in reinforcement learning problems. On a simulated inverted-pendulum control problem, "genetic reinforcement learning" produces results competitive with those of AHC (the adaptive heuristic critic), another well-known reinforcement learning paradigm for neural networks, which employs the temporal difference method. The two approaches are compared in terms of learning rates, performance-based generalization, and control behavior over time.
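To make the idea in the abstract concrete, the following is a minimal sketch of a genetic search over the weights of a small network that balances a simulated inverted pendulum. It is not the authors' implementation. Everything in it is an assumption chosen for illustration: the frictionless cart-pole constants, the 4-5-1 network, the fixed start state, the mutation-only steady-state search with rank-biased selection, and the names `act`, `step`, `fitness`, and `evolve`. The only feedback the search receives is how long the pole stayed up.

```python
import math
import random

# Cart-pole constants: commonly used benchmark values, assumed here.
GRAVITY, M_CART, M_POLE, HALF_LEN = 9.8, 1.0, 0.1, 0.5
FORCE, DT = 10.0, 0.02
X_LIMIT, THETA_LIMIT = 2.4, 12 * math.pi / 180
N_IN, N_HID = 4, 5
N_WEIGHTS = N_HID * (N_IN + 1) + (N_HID + 1)  # weights plus biases


def act(weights, state):
    """4 inputs -> N_HID tanh units -> 1 output; the output sign picks the push direction."""
    out = weights[-1]  # output bias
    for h in range(N_HID):
        base = h * (N_IN + 1)
        s = weights[base + N_IN]  # hidden-unit bias
        for i in range(N_IN):
            s += weights[base + i] * state[i]
        out += weights[N_HID * (N_IN + 1) + h] * math.tanh(s)
    return FORCE if out > 0 else -FORCE


def step(state, force):
    """One Euler step of the frictionless cart-pole equations."""
    x, x_dot, th, th_dot = state
    total = M_CART + M_POLE
    sin_t, cos_t = math.sin(th), math.cos(th)
    tmp = (force + M_POLE * HALF_LEN * th_dot ** 2 * sin_t) / total
    th_acc = (GRAVITY * sin_t - cos_t * tmp) / (
        HALF_LEN * (4.0 / 3.0 - M_POLE * cos_t ** 2 / total))
    x_acc = tmp - M_POLE * HALF_LEN * th_acc * cos_t / total
    return (x + DT * x_dot, x_dot + DT * x_acc,
            th + DT * th_dot, th_dot + DT * th_acc)


def fitness(weights, max_steps=500):
    """Reinforcement signal: time steps survived before failure (capped)."""
    state = (0.0, 0.0, 0.05, 0.0)  # fixed start state, a simplification
    for t in range(max_steps):
        if abs(state[0]) > X_LIMIT or abs(state[2]) > THETA_LIMIT:
            return t
        state = step(state, act(weights, state))
    return max_steps


def evolve(pop_size=30, evaluations=300, sigma=0.5, seed=0):
    """Steady-state search: one offspring per iteration replaces the worst member."""
    rng = random.Random(seed)
    pop = []
    for _ in range(pop_size):
        w = [rng.uniform(-1, 1) for _ in range(N_WEIGHTS)]
        pop.append((fitness(w), w))
    pop.sort(key=lambda p: p[0], reverse=True)  # best first
    initial_best = pop[0][0]
    for _ in range(evaluations):
        # Rank-biased selection: the lower of two random ranks favours fitter parents.
        parent = pop[min(rng.randrange(pop_size), rng.randrange(pop_size))][1]
        child = [w + rng.gauss(0, sigma) for w in parent]  # Gaussian mutation
        f = fitness(child)
        if f > pop[-1][0]:  # insert only if better than the current worst
            pop[-1] = (f, child)
            pop.sort(key=lambda p: p[0], reverse=True)
    return initial_best, pop[0][0], pop[0][1]


if __name__ == "__main__":
    start, best, _ = evolve()
    print("best initial fitness:", start, "best final fitness:", best)
```

Because an offspring only ever replaces the worst member, the best fitness in the population cannot decrease during the run. A temporal difference method such as AHC would instead adjust weights during each trial from a learned prediction of failure, which is the contrast the abstract draws.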
Cite this article
Whitley, D., Dominic, S., Das, R. et al. Genetic reinforcement learning for neurocontrol problems. Mach Learn 13, 259–284 (1993). https://doi.org/10.1007/BF00993045
DOI: https://doi.org/10.1007/BF00993045