
Part of the book series: NATO ASI Series (NATO ASI F, volume 144)

Abstract

Learning provides a useful tool for the automatic design of autonomous robots. Recent research on learning robot control has predominantly focused on learning single tasks that were studied in isolation. If robots encounter a multitude of control learning tasks over their entire lifetime, there is an opportunity to transfer knowledge between them. To do so, robots may learn the invariants and regularities of their individual tasks and environments. This task-independent knowledge can be employed to bias generalization when learning control, which reduces the need for real-world experimentation. We argue that knowledge transfer is essential if robots are to learn control with moderate learning times in complex scenarios. Two approaches to lifelong robot learning, both of which capture invariant knowledge about the robot and its environments, are presented. Both approaches have been evaluated using a HERO-2000 mobile robot. Learning tasks included navigation in unknown indoor environments and a simple find-and-fetch task.

This paper is also available as Technical report IAI-TR-93-7, University of Bonn, Dept. of Computer Science III, March 1993.
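To make the transfer idea concrete, here is a minimal, hypothetical Python sketch of one way task-independent knowledge can bias a new learning task: an action model fitted from transitions gathered on earlier tasks is reused to generate simulated experience for Q-learning under a new reward function, so fewer real-world trials are needed. The names (`ActionModel`, `q_learning_with_model`), the tabular setting, and the random-exploration policy are illustrative assumptions, not the EBNN implementation described in the paper.

```python
import random
from collections import defaultdict


class ActionModel:
    """Hypothetical task-independent action model: (state, action) -> next state.
    In the spirit of the paper it would be learned across many earlier tasks;
    here it is a plain lookup table fitted from logged transitions."""

    def __init__(self):
        self.transitions = {}

    def update(self, state, action, next_state):
        self.transitions[(state, action)] = next_state

    def predict(self, state, action):
        return self.transitions.get((state, action))


def q_learning_with_model(model, reward_fn, states, actions,
                          episodes=200, horizon=20, alpha=0.5, gamma=0.9):
    """Tabular Q-learning driven by rollouts through the transferred model
    instead of real robot trials (an illustrative sketch, not the paper's
    algorithm)."""
    q = defaultdict(float)
    for _ in range(episodes):
        state = random.choice(states)
        for _ in range(horizon):
            action = random.choice(actions)           # random exploration
            next_state = model.predict(state, action)
            if next_state is None:                    # model lacks experience here;
                break                                 # a real-world trial would be needed
            target = reward_fn(next_state) + gamma * max(
                q[(next_state, a)] for a in actions)
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = next_state
    return q


# Toy 1-D corridor: the model was "learned" on earlier tasks in the same world.
model = ActionModel()
for s in range(5):
    model.update(s, "right", min(s + 1, 4))
    model.update(s, "left", max(s - 1, 0))

# New task: reach cell 4. Only the reward changes; the world knowledge transfers.
q = q_learning_with_model(model, lambda s: 1.0 if s == 4 else 0.0,
                          states=list(range(5)), actions=["left", "right"])
```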




Copyright information

© 1995 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Thrun, S., Mitchell, T.M. (1995). Lifelong Robot Learning. In: Steels, L. (ed.) The Biology and Technology of Intelligent Autonomous Agents. NATO ASI Series, vol 144. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-79629-6_7


  • DOI: https://doi.org/10.1007/978-3-642-79629-6_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-79631-9

  • Online ISBN: 978-3-642-79629-6

