Child: A First Step Towards Continual Learning

Chapter in Learning to Learn

Abstract

Continual learning is the constant development of increasingly complex behaviors; the process of building more complicated skills on top of those already developed. A continual-learning agent should therefore learn incrementally and hierarchically. This paper describes CHILD, an agent capable of Continual, Hierarchical, Incremental Learning and Development. CHILD can quickly solve complicated non-Markovian reinforcement-learning tasks and can then transfer its skills to similar but even more complicated tasks, learning these faster still.
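The transfer behavior the abstract describes, solving a simple task and then reusing the result to master a harder variant faster, can be sketched with ordinary tabular Q-learning. This is an illustrative toy, not the CHILD architecture itself; the corridor task, the `q_learning` helper, and every parameter below are assumptions made purely for the example:

```python
import random

def q_learning(n_states, episodes, q=None, alpha=0.5, gamma=0.9, seed=0):
    """Off-policy tabular Q-learning on a toy corridor task.

    States are 0..n_states-1; actions are -1 (left) and +1 (right); the only
    reward is 1.0 for stepping onto the rightmost state. Behavior is a random
    walk, so every state-action pair gets visited. Passing in an existing
    Q-table `q` continues learning from it instead of starting from scratch.
    """
    rng = random.Random(seed)
    q = {} if q is None else q
    for _ in range(episodes):
        s = 0
        for _ in range(6 * n_states):              # step cap per episode
            a = rng.choice((-1, 1))                # exploratory behavior policy
            s2 = min(max(s + a, 0), n_states - 1)
            r = 1.0 if s2 == n_states - 1 else 0.0
            best_next = max(q.get((s2, b), 0.0) for b in (-1, 1))
            q[(s, a)] = q.get((s, a), 0.0) + alpha * (
                r + gamma * best_next - q.get((s, a), 0.0))
            s = s2
            if r:                                  # goal reached: end episode
                break
    return q

# Learn a short corridor first, then reuse the learned Q-table as the
# starting point for a longer corridor (transfer) rather than starting cold.
q_small    = q_learning(n_states=4, episodes=200)
q_transfer = q_learning(n_states=8, episodes=200, q=dict(q_small))
```

Seeding the eight-state task with `q_small` gives the agent sensible value estimates for the first four states immediately, which is the essence of the incremental transfer the abstract claims: the simpler skill becomes the starting point for the harder one.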




Copyright information

© 1998 Springer Science+Business Media New York

Cite this chapter

Ring, M.B. (1998). Child: A First Step Towards Continual Learning. In: Thrun, S., Pratt, L. (eds) Learning to Learn. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-5529-2_11

  • DOI: https://doi.org/10.1007/978-1-4615-5529-2_11

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4613-7527-2

  • Online ISBN: 978-1-4615-5529-2
