Advances in Evolutionary Computing, pp. 579–612

# Exploring the Predictable

## Abstract

Details of complex event sequences are often not predictable, but their reduced abstract representations are. I study an embedded active learner that can limit its predictions to almost arbitrary computable aspects of spatio-temporal events. It constructs probabilistic algorithms that (1) control interaction with the world, (2) map event sequences to abstract internal representations (IRs), and (3) predict IRs from IRs computed earlier. Its goal is to create novel algorithms generating IRs useful for correct IR predictions, without wasting time on those learned before. This requires an adaptive novelty measure, implemented here by a coevolutionary scheme involving two competing modules that *collectively* design (initially random) algorithms representing experiments. Using special instructions, the modules can bet on the outcomes of IR predictions computed by algorithms they have agreed upon. If their opinions differ, the system checks who is right, punishes the loser (the surprised one), and rewards the winner. An evolutionary or reinforcement learning algorithm forces each module to maximize reward. This motivates both modules to lure each other into agreeing upon experiments involving predictions that surprise the other. Since each module can essentially veto experiments it does not consider profitable, the system is motivated to focus on those computable aspects of the environment where both modules still have confident but different opinions. Once both share the same opinion on a particular issue (via the loser’s learning process, e.g., the winner is simply copied onto the loser), the winner loses a source of reward — an incentive to shift the focus of interest onto novel experiments. My simulations include an example where surprise-generation of this kind helps to speed up the collection of external reward.
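The betting dynamics described above can be illustrated with a minimal sketch. This is a hypothetical toy model, not the chapter's actual implementation: experiments are reduced to indexed binary outcomes, each module's "algorithm" to a fixed table of opinions and confidences, and the loser's learning process to copying the winner's opinion. It only demonstrates the zero-sum reward flow, the mutual veto, and why settled disagreements stop paying.

```python
import random

N_EXPERIMENTS = 20
# Ground truth the environment reveals when a bet is settled.
TRUE_OUTCOME = [random.randint(0, 1) for _ in range(N_EXPERIMENTS)]

def make_module():
    """A module holds an opinion (predicted bit) and a confidence per experiment."""
    return {
        "opinion":    [random.randint(0, 1) for _ in range(N_EXPERIMENTS)],
        "confidence": [random.random() for _ in range(N_EXPERIMENTS)],
        "reward": 0,
    }

left, right = make_module(), make_module()

for step in range(500):
    e = random.randrange(N_EXPERIMENTS)
    # Veto: the experiment runs only if both modules consider it profitable,
    # modeled here as both being confident about its outcome.
    if left["confidence"][e] < 0.5 or right["confidence"][e] < 0.5:
        continue
    if left["opinion"][e] == right["opinion"][e]:
        continue  # shared opinion: no bet, no surprise, no reward
    # Opinions differ: check who is right; reward is zero-sum.
    if left["opinion"][e] == TRUE_OUTCOME[e]:
        winner, loser = left, right
    else:
        winner, loser = right, left
    winner["reward"] += 1
    loser["reward"] -= 1
    # Loser's learning process: copy the winner's opinion, which removes
    # this experiment as a future source of reward for the winner.
    loser["opinion"][e] = winner["opinion"][e]

# Zero-sum by construction: total reward is always 0.
print(left["reward"] + right["reward"])
```

Once every contested, mutually confident opinion has been settled, no further bets pay off, which is exactly the incentive the abstract mentions for shifting attention to novel experiments.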

## Keywords

Neural Computation · Kolmogorov Complexity · Basic Cycle · Module Modification · Instruction Pointer

