Abstract
Reinforcement learning allows an agent to be both reactive and adaptive, but it requires a simple yet consistent representation of the task environment. In robotics this representation is the product of perception. Perception is a powerful simplifying mechanism because it ignores much of the complexity of the world by mapping multiple world states to each of a few representational states. The constraint of consistency conflicts with simplicity, however. A consistent representation distinguishes world states that have distinct utilities, but perception systems with sufficient acuity to do this tend to also make many unnecessary distinctions.
In this paper we discuss reinforcement learning and the problem of appropriate perception. We then investigate a method for dealing with the problem, called the Lion algorithm [1], and show that it can be used to reduce complexity by decomposing perception. The Lion algorithm does not allow iterative rules to be learned, and we describe modifications that overcome this limitation. We present experimental results that demonstrate their effectiveness in further reducing complexity. Finally, we mention some related research, and conclude with suggestions for further work.
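The reinforcement learning setting the abstract refers to is standard temporal-difference learning of state-action utilities [20, 21, 23]. As a point of reference, the following is a minimal tabular Q-learning sketch, not the Lion algorithm itself; the corridor world, reward values, and learning parameters are illustrative assumptions. It shows the role the perceptual representation plays: here each world state maps to its own representational state, so the learned utilities are consistent.

```python
import random

# Toy corridor world: positions 0..4, reward only on reaching the goal.
# Perception is trivially consistent here (one representational state per
# world position), so Watkins-style Q-learning converges to a correct policy.
N_STATES = 5
ACTIONS = [-1, 1]                    # move left or right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.3
GOAL = N_STATES - 1

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(s, a):
    """World dynamics: move within the corridor; reward only at the goal."""
    s2 = max(0, min(GOAL, s + a))
    return s2, (1.0 if s2 == GOAL else 0.0)

random.seed(0)
for _ in range(500):
    s = 0
    while s != GOAL:
        # epsilon-greedy action selection
        if random.random() < EPSILON:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda x: Q[(s, x)])
        s2, r = step(s, a)
        # one-step Q-learning update
        Q[(s, a)] += ALPHA * (r + GAMMA * max(Q[(s2, b)] for b in ACTIONS) - Q[(s, a)])
        s = s2

# Greedy policy after learning: move right (+1) at every non-goal position.
policy = {s: max(ACTIONS, key=lambda x: Q[(s, x)]) for s in range(GOAL)}
print(policy)
```

If perception instead collapsed distinct corridor positions into one representational state (the inconsistency the abstract describes), the single Q-value for that merged state could not reflect both positions' distinct utilities, and the learned policy would degrade accordingly.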
References
Stephen D. Whitehead and Dana H. Ballard, “Learning to perceive and act by trial and error,” Machine Learning, 7(1):45–83, July 1991.
David Chapman, “Planning for conjunctive goals,” Artificial Intelligence, 32:333–377, 1987.
Lonnie Chrisman and Reid Simmons, “Sensible planning: Focusing perceptual attention,” in Proc. 9th Nat. (USA) Conf. on AI, pages 756–761, Menlo Park, CA, July 1991. MIT Press.
R. James Firby, “An investigation into reactive planning in complex domains,” in Proc. 6th Nat. Conf. on Artificial Intelligence, pages 202–206, San Mateo, CA, July 1987. Morgan Kaufmann.
Michael P. Georgeff and Amy L. Lansky, “Reactive reasoning and planning,” in Proc. 6th Nat. Conf. on Artificial Intelligence, pages 677–682, San Mateo, CA, July 1987. Morgan Kaufmann.
Philip E. Agre and David Chapman, “Pengi: An implementation of a theory of action,” in Proc. 6th Nat. (USA) Conf. on AI, pages 268–272, San Mateo, CA, July 1987. Morgan Kaufmann.
Rodney A. Brooks, “A robust layered control system for a mobile robot,” IEEE Journal of Robotics and Automation, RA-2(1):14–23, March 1986.
Stanley J. Rosenschein, “Formal theories of knowledge in AI and robotics,” New Generation Computing, 3(4):345–357, 1985.
Rodney A. Brooks, “Intelligence without reason,” in Proc. 12th Int. Joint Conf. on AI, pages 569–595, San Mateo, CA, 1991. Morgan Kaufmann.
Rodney A. Brooks, “Elephants don't play chess,” in Pattie Maes, editor, Designing Autonomous Agents: Theory and Practice from Biology to Engineering and Back, pages 3–15. MIT Press, Cambridge, MA, 1990.
Rodney A. Brooks, “Challenges for complete creature architectures,” in Jean-Arcady Meyer and Stewart W. Wilson, editors, From Animals to Animats: Proc. 1st Int. Conf. on Simulation of Adaptive Behaviour, pages 434–443, Cambridge, MA, 1991. MIT Press.
Sridhar Mahadevan and Jonathan Connell, “Automatic programming of behaviour-based robots using reinforcement learning,” Artificial Intelligence, 55:311–365, 1992.
Jon Doyle, “Rationality and its role in reasoning,” in Proc. 8th Nat. (USA) Conf. on AI, AAAI-90, pages 1093–1100, Cambridge, MA, 1990. MIT Press.
Pattie Maes and Rodney A. Brooks, “Learning to coordinate behaviours,” in Proc. 8th Nat. (USA) Conf. on AI, pages 796–802, Menlo Park, CA, July 1990. MIT Press.
Richard S. Sutton, “Learning to predict by the methods of temporal differences,” Machine Learning, 3:9–43, 1988.
Christopher J. C. H. Watkins, “Learning from Delayed Rewards,” PhD thesis, Department of Computer Science, University of Cambridge, Cambridge, U.K., 1989.
Richard Bellman and Stuart Dreyfus, “Applied Dynamic Programming,” Princeton University Press, Princeton, NJ, 1962.
Christopher J. C. H. Watkins and Peter Dayan, “Q-learning,” Machine Learning, 8(3/4):279–292, 1992.
Douglas B. Lenat and John Seely Brown, “Why AM and EURISKO appear to work,” Artificial Intelligence, 23:269–294, 1984.
Matthew T. Mason, “Kicking the sensing habit,” AI Magazine, 14(1):58–59, 1993.
David P. Miller, “A twelve step program to more efficient robots,” AI Magazine, 14(1):60–63, 1993.
Long-Ji Lin, “Self-improving reactive agents based on reinforcement learning, planning and teaching,” Machine Learning, 8(3/4):293–321, 1992.
Ming Tan, “Cost-sensitive reinforcement learning for adaptive classification and control,” in Proc. 9th Nat. Conf. on AI, pages 774–780, Menlo Park, CA, July 1991. MIT Press.
Piggott, P., Sattar, A. Reinforcement learning of iterative behaviour with multiple sensors. Appl Intell 4, 351–365 (1994). https://doi.org/10.1007/BF00872474