Abstract
Continual learning is the constant development of increasingly complex behaviors: the process of building more complicated skills on top of those already developed. A continual-learning agent should therefore learn incrementally and hierarchically. This paper describes CHILD, an agent capable of Continual, Hierarchical, Incremental Learning and Development. CHILD can quickly solve complicated non-Markovian reinforcement-learning tasks and can then transfer its skills to similar but even more complicated tasks, learning these faster still.
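The transfer claim in the abstract can be illustrated with a toy experiment. The sketch below is only a flavor of the idea, not CHILD itself: CHILD couples Q-learning with a constructive recurrent network (the Temporal Transition Hierarchy from Ring's earlier work) to handle non-Markovian tasks, whereas this toy uses plain tabular Q-learning on a Markovian corridor. The corridor task, function names, and parameters here are all invented for illustration.

```python
# Illustrative sketch only: NOT Ring's architecture. It shows the
# continual-learning payoff the abstract describes: an agent trained on
# an easy task learns a harder, related task faster when it keeps what
# it already learned. All names and parameters are invented.
import random
from collections import defaultdict

def make_corridor(length):
    """1-D corridor: start at cell 0; reaching cell `length` ends the
    episode with reward 1. Actions: 0 = left, 1 = right."""
    def step(state, action):
        nxt = max(0, state - 1) if action == 0 else state + 1
        done = (nxt == length)
        return nxt, (1.0 if done else 0.0), done
    return step

def choose(q, s, eps):
    # Epsilon-greedy with random tie-breaking (so untrained states
    # explore instead of getting stuck on a default action).
    if random.random() < eps or q[(s, 0)] == q[(s, 1)]:
        return random.randrange(2)
    return 0 if q[(s, 0)] > q[(s, 1)] else 1

def train(step_fn, q, episodes, eps=0.1, alpha=0.5, gamma=0.95):
    steps = []
    for _ in range(episodes):
        s, done, n = 0, False, 0
        while not done:
            a = choose(q, s, eps)
            s2, r, done = step_fn(s, a)
            target = r if done else r + gamma * max(q[(s2, 0)], q[(s2, 1)])
            q[(s, a)] += alpha * (target - q[(s, a)])
            s, n = s2, n + 1
        steps.append(n)
    return steps

random.seed(0)
q = defaultdict(float)
train(make_corridor(5), q, episodes=200)                    # easy task first
fresh = train(make_corridor(10), defaultdict(float), 200)   # from scratch
transfer = train(make_corridor(10), q, 200)                 # keep old values
# The transferring agent typically needs far fewer steps early on.
print("first 20 episodes, from scratch:", sum(fresh[:20]))
print("first 20 episodes, with transfer:", sum(transfer[:20]))
```

In CHILD itself what transfers is not a value table but the learned network hierarchy: new units are added incrementally as harder tasks demand longer temporal dependencies, so earlier skills become building blocks for later ones.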
Copyright information
© 1998 Springer Science+Business Media New York
Cite this chapter
Ring, M.B. (1998). CHILD: A First Step Towards Continual Learning. In: Thrun, S., Pratt, L. (eds) Learning to Learn. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-5529-2_11
DOI: https://doi.org/10.1007/978-1-4615-5529-2_11
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4613-7527-2
Online ISBN: 978-1-4615-5529-2