
Part of the book series: NATO ASI Series (NATO ASI F, volume 144)

Abstract

Learning provides a useful tool for the automatic design of autonomous robots. Recent research on learning robot control has predominantly focused on learning single tasks that were studied in isolation. If robots encounter a multitude of control learning tasks over their entire lifetime, there is an opportunity to transfer knowledge between them. To do so, robots may learn the invariants and regularities of their individual tasks and environments. This task-independent knowledge can be employed to bias generalization when learning control, which reduces the need for real-world experimentation. We argue that knowledge transfer is essential if robots are to learn control with moderate learning times in complex scenarios. Two approaches to lifelong robot learning, both of which capture invariant knowledge about the robot and its environments, are presented. Both approaches have been evaluated using a HERO-2000 mobile robot. Learning tasks included navigation in unknown indoor environments and a simple find-and-fetch task.

This paper is also available as Technical report IAI-TR-93-7, University of Bonn, Dept. of Computer Science III, March 1993.
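To make the transfer idea concrete, here is a minimal, hypothetical Python sketch of one way task-independent knowledge can bias a new learning task: an action model fitted from transitions gathered on earlier tasks is reused to generate simulated experience for Q-learning under a new reward function, so fewer real-world trials are needed. The names (`ActionModel`, `q_learning_with_model`), the tabular setting, and the random-exploration policy are illustrative assumptions, not the EBNN implementation described in the paper.

```python
import random
from collections import defaultdict


class ActionModel:
    """Hypothetical task-independent action model: (state, action) -> next state.
    In the spirit of the paper it would be learned across many earlier tasks;
    here it is a plain lookup table fitted from logged transitions."""

    def __init__(self):
        self.transitions = {}

    def update(self, state, action, next_state):
        self.transitions[(state, action)] = next_state

    def predict(self, state, action):
        return self.transitions.get((state, action))


def q_learning_with_model(model, reward_fn, states, actions,
                          episodes=200, horizon=20, alpha=0.5, gamma=0.9):
    """Tabular Q-learning driven by rollouts through the transferred model
    instead of real robot trials (an illustrative sketch, not the paper's
    algorithm)."""
    q = defaultdict(float)
    for _ in range(episodes):
        state = random.choice(states)
        for _ in range(horizon):
            action = random.choice(actions)           # random exploration
            next_state = model.predict(state, action)
            if next_state is None:                    # model lacks experience here;
                break                                 # a real-world trial would be needed
            target = reward_fn(next_state) + gamma * max(
                q[(next_state, a)] for a in actions)
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = next_state
    return q


# Toy 1-D corridor: the model was "learned" on earlier tasks in the same world.
model = ActionModel()
for s in range(5):
    model.update(s, "right", min(s + 1, 4))
    model.update(s, "left", max(s - 1, 0))

# New task: reach cell 4. Only the reward changes; the world knowledge transfers.
q = q_learning_with_model(model, lambda s: 1.0 if s == 4 else 0.0,
                          states=list(range(5)), actions=["left", "right"])
```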




Copyright information

© 1995 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Thrun, S., Mitchell, T.M. (1995). Lifelong Robot Learning. In: Steels, L. (ed.) The Biology and Technology of Intelligent Autonomous Agents. NATO ASI Series, vol 144. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-79629-6_7


  • DOI: https://doi.org/10.1007/978-3-642-79629-6_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-79631-9

  • Online ISBN: 978-3-642-79629-6

