Abstract
An agent that must learn to act in the world by trial and error faces the reinforcement learning problem, which is quite different from standard concept learning. Although good algorithms exist for this problem in the general case, they are often quite inefficient and do not exhibit generalization. One strategy is to find restricted classes of action policies that can be learned more efficiently. This paper pursues that strategy by developing algorithms that can efficiently learn action maps that are expressible in k-DNF. The algorithms are compared with existing methods in empirical trials and are shown to have very good performance.
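To make the representation class concrete: a k-DNF formula is a disjunction of terms, each term a conjunction of at most k literals over the boolean input features. The sketch below is not the paper's learning algorithm, only a minimal illustration (with hypothetical names such as `eval_kdnf`) of how an action map expressible in k-DNF can be encoded and evaluated.

```python
# A k-DNF formula: a disjunction (OR) of terms, each term a conjunction
# (AND) of at most k literals. A literal is encoded as (index, polarity):
# polarity True means x[index] must be True, False means it must be False.

def eval_kdnf(terms, x):
    """Return True iff at least one term is satisfied by boolean input x."""
    return any(all(x[i] == pol for i, pol in term) for term in terms)

# Example over 3 features: (x0 AND NOT x2) OR x1  -- here k = 2.
formula = [[(0, True), (2, False)], [(1, True)]]

print(eval_kdnf(formula, [True, False, False]))   # True: first term holds
print(eval_kdnf(formula, [False, False, True]))   # False: no term holds
```

An associative reinforcement learner over this class would search the (polynomially sized, for fixed k) space of such terms, retaining those whose inclusion is statistically supported by the reward signal.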
Cite this article
Kaelbling, L.P. Associative Reinforcement Learning: Functions in k-DNF. Machine Learning 15, 279–298 (1994). https://doi.org/10.1023/A:1022689909846