Machine Learning, Volume 5, Issue 4, pp 355–381

Learning sequential decision rules using simulation models and competition

  • John J. Grefenstette
  • Connie Loggia Ramsey
  • Alan C. Schultz

Abstract

The problem of learning decision rules for sequential tasks is addressed, focusing on the problem of learning tactical decision rules from a simple flight simulator. The learning method relies on the notion of competition and employs genetic algorithms to search the space of decision policies. Several experiments are presented that address issues arising from differences between the simulation model on which learning occurs and the target environment on which the decision rules are ultimately tested.
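The general idea of searching a space of decision policies with a genetic algorithm, scoring each candidate on a simulation model, can be sketched as follows. This is only an illustrative toy and not the authors' system: the one-dimensional task, the bucketed sensor, the `noise` parameter, and every function name are assumptions introduced here, and tournament selection stands in for whatever form of competition the paper actually uses.

```python
import random

ACTIONS = (-1, 0, 1)   # hypothetical actions: move left, stay, move right
N_BUCKETS = 5          # hypothetical sensor: position bucketed into 5 ranges

def sense(pos):
    """Map a position (clipped to [-10, 10]) to a bucket index 0..4."""
    clipped = max(-10, min(10, pos))
    return min(N_BUCKETS - 1, int((clipped + 10) * N_BUCKETS / 20))

def simulate(policy, rng, noise, steps=20):
    """Run one episode on the toy model; payoff is higher nearer to 0."""
    pos = rng.uniform(-10, 10)
    for _ in range(steps):
        # One decision rule per sensor bucket: policy[bucket] -> action.
        pos += policy[sense(pos)] + rng.gauss(0, noise)
    return -abs(pos)

def fitness(policy, rng, noise, episodes=10):
    """Average payoff of a policy over several simulated episodes."""
    return sum(simulate(policy, rng, noise) for _ in range(episodes)) / episodes

def evolve(rng, noise=0.1, pop_size=30, generations=25):
    """Genetic search over policies using tournament selection."""
    pop = [[rng.choice(ACTIONS) for _ in range(N_BUCKETS)]
           for _ in range(pop_size)]
    for _ in range(generations):
        scored = [(fitness(p, rng, noise), p) for p in pop]

        def pick():
            # Two policies compete; the one with the higher payoff reproduces.
            a, b = rng.sample(scored, 2)
            return max(a, b, key=lambda s: s[0])[1]

        nxt = []
        while len(nxt) < pop_size:
            mom, dad = pick(), pick()
            cut = rng.randrange(1, N_BUCKETS)      # one-point crossover
            child = mom[:cut] + dad[cut:]
            if rng.random() < 0.2:                 # occasional mutation
                child[rng.randrange(N_BUCKETS)] = rng.choice(ACTIONS)
            nxt.append(child)
        pop = nxt
    return max(pop, key=lambda p: fitness(p, rng, noise))
```

To mimic, very loosely, a mismatch between the simulation model and the target environment, a policy returned by `evolve(random.Random(0), noise=0.1)` could be re-scored with `fitness(policy, random.Random(1), noise=1.0)`, so that learning and testing use different noise levels.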

Key words

Sequential decision rules, competition-based learning, genetic algorithms

Copyright information

© Kluwer Academic Publishers 1990

Authors and Affiliations

  • John J. Grefenstette (1)
  • Connie Loggia Ramsey (1)
  • Alan C. Schultz (1)

  1. Naval Research Laboratory, Navy Center for Applied Research in Artificial Intelligence, Washington, DC
