A Formal Theory of Creativity to Model the Creation of Art

  • Jürgen Schmidhuber


According to the Formal Theory of Creativity (1990–2010), a creative agent (one that never stops generating non-trivial, novel, and surprising behaviours and data) must have two learning components: a general reward optimiser or reinforcement learner, and an adaptive encoder of the agent’s growing data history (the record of the agent’s interaction with its environment). The learning progress of the encoder is the intrinsic reward for the reward optimiser. That is, the reward optimiser is motivated to invent interesting spatio-temporal patterns that the encoder does not yet know but can learn to encode better with little computational effort. To maximise expected reward (in the absence of external reward), the reward optimiser will create increasingly complex behaviours that yield temporarily surprising (but eventually boring) patterns that make the encoder improve quickly. I have argued that this simple principle explains science, art, music and humour. It can be rigorously formalised and implemented on learning machines, thus building artificial robotic scientists and artists equipped with curiosity and creativity. I summarise my work on this topic since 1990, and present a previously unpublished low-complexity artwork computable by a very short program discovered through active search for novel patterns according to the principles of the theory.
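The core principle, that the encoder's learning progress serves as intrinsic reward, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the chapter's own implementation: the encoder is a toy bigram model with add-one smoothing, the names `AdaptiveEncoder`, `intrinsic_reward` and `run_agent` are hypothetical, and the reward optimiser itself is omitted, so the sketch only shows the reward signal that such an optimiser would receive.

```python
import math
import random
from collections import Counter


class AdaptiveEncoder:
    """Toy adaptive encoder (an assumption): bigram model with add-one smoothing."""

    def __init__(self, alphabet):
        self.alphabet = list(alphabet)
        self.counts = {symbol: Counter() for symbol in self.alphabet}

    def code_length(self, history):
        """Bits needed to encode the transitions in `history` under the current model."""
        bits = 0.0
        for prev, cur in zip(history, history[1:]):
            total = sum(self.counts[prev].values()) + len(self.alphabet)
            bits -= math.log2((self.counts[prev][cur] + 1) / total)
        return bits

    def learn(self, observations):
        """Update transition counts from newly observed data."""
        for prev, cur in zip(observations, observations[1:]):
            self.counts[prev][cur] += 1


def intrinsic_reward(encoder, history, new_observations):
    """Compression progress: bits saved on the whole history by one learning step."""
    full = history + new_observations
    before = encoder.code_length(full)
    encoder.learn(new_observations)
    after = encoder.code_length(full)
    return before - after


def run_agent(chunks, alphabet="ab"):
    """Feed data chunks to a fresh encoder and record the reward for each chunk."""
    encoder = AdaptiveEncoder(alphabet)
    history, rewards = "", []
    for chunk in chunks:
        rewards.append(intrinsic_reward(encoder, history, chunk))
        history += chunk
    return rewards


if __name__ == "__main__":
    random.seed(0)
    regular = ["ab" * 10] * 5  # a learnable regularity
    noise = ["".join(random.choice("ab") for _ in range(20)) for _ in range(5)]
    print("regular:", [round(r, 2) for r in run_agent(regular)])
    print("noise:  ", [round(r, 2) for r in run_agent(noise)])
```

Under these assumptions, the regular stream earns a large reward on its first chunk (roughly 16 bits saved) that shrinks to under one bit by the fifth chunk, which mirrors the "temporarily surprising but eventually boring" behaviour described above. The random stream should yield little progress at any point, since the toy encoder has nothing to learn from it.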


Keywords: Formal Theory, Compression Algorithm, Subjective Observer, Short Program, Intrinsic Reward



This chapter draws heavily on previous publications (Schmidhuber 2006a; 2007b; 2009c; 2009b; 2009a; 2010). Thanks to Jon McCormack, Mark d’Inverno, Benjamin Kuipers, Herbert W. Franke, Marcus Hutter, Andy Barto, Jonathan Lansey, Julian Togelius, Faustino J. Gomez, Giovanni Pezzulo, Gianluca Baldassarre, and Martin Butz for useful comments that helped to improve this chapter or earlier papers on this subject.


  1. Bense, M. (1969). Einführung in die informationstheoretische Ästhetik. Grundlegung und Anwendung in der Texttheorie (Introduction to information-theoretical aesthetics. Foundation and application to text theory). Rowohlt Taschenbuch Verlag.
  2. Berlyne, D. E. (1950). Novelty and curiosity as determinants of exploratory behavior. British Journal of Psychology, 41, 68–80.
  3. Berlyne, D. E. (1960). Conflict, arousal, and curiosity. New York: McGraw-Hill.
  4. Birkhoff, G. D. (1933). Aesthetic measure. Cambridge: Harvard University Press.
  5. Collingwood, R. G. (1938). The principles of art. London: Oxford University Press.
  6. Cuccu, G., Luciw, M., Schmidhuber, J., & Gomez, F. (2011). Intrinsically motivated evolutionary search for vision-based reinforcement learning. In Proceedings of the 2011 IEEE conference on development and learning and epigenetic robotics IEEE-ICDL-EPIROB. New York: IEEE Press.
  7. Danto, A. (1981). The transfiguration of the commonplace. Cambridge: Harvard University Press.
  8. Dutton, D. (2002). Aesthetic universals. In B. Gaut & D. M. Lopes (Eds.), The Routledge companion to aesthetics.
  9. Frank, H. G. (1964). Kybernetische Analysen subjektiver Sachverhalte. Quickborn: Verlag Schnelle.
  10. Frank, H. G., & Franke, H. W. (2002). Ästhetische Information. Estetika informacio. Eine Einführung in die kybernetische Ästhetik. Kopäd Verlag.
  11. Franke, H. W. (1979). Kybernetische Ästhetik. Phänomen kunst (3rd ed.). Munich: Ernst Reinhardt Verlag.
  12. Goodman, N. (1968). Languages of art: an approach to a theory of symbols. Indianapolis: The Bobbs-Merrill Company.
  13. Harlow, H. F., Harlow, M. K., & Meyer, D. R. (1950). Learning motivated by a manipulation drive. Journal of Experimental Psychology, 40, 228–234.
  14. Huffman, D. A. (1952). A method for construction of minimum-redundancy codes. Proceedings IRE, 40, 1098–1101.
  15. Hutter, M. (2005). Universal artificial intelligence: sequential decisions based on algorithmic probability. Berlin: Springer. On J. Schmidhuber’s SNF grant 20-61847.
  16. Kaelbling, L. P., Littman, M. L., & Moore, A. W. (1996). Reinforcement learning: a survey. Journal of Artificial Intelligence Research, 4, 237–285.
  17. Kant, I. (1781). Critik der reinen Vernunft.
  18. Kolmogorov, A. N. (1965). Three approaches to the quantitative definition of information. Problems of Information Transmission, 1, 1–11.
  19. Levin, L. A. (1973). Universal sequential search problems. Problems of Information Transmission, 9(3), 265–266.
  20. Li, M., & Vitányi, P. M. B. (1997). An introduction to Kolmogorov complexity and its applications (2nd ed.). Berlin: Springer.
  21. Luciw, M., Graziano, V., Ring, M., & Schmidhuber, J. (2011). Artificial curiosity with planning for autonomous perceptual and cognitive development. In Proceedings of the first joint conference on development learning and on epigenetic robotics ICDL-EPIROB, Frankfurt.
  22. Mandelbrot, B. (1982). The fractal geometry of nature. San Francisco: Freeman.
  23. Moles, A. (1968). Information theory and esthetic perception. Champaign: University of Illinois Press.
  24. Nake, F. (1974). Ästhetik als Informationsverarbeitung. Berlin: Springer.
  25. Ngo, H., Ring, M., & Schmidhuber, J. (2011). Compression Progress-based curiosity drive for developmental learning. In Proceedings of the 2011 IEEE conference on development and learning and epigenetic robotics IEEE-ICDL-EPIROB. New York: IEEE Press.
  26. Piaget, J. (1955). The child’s construction of reality. London: Routledge and Kegan Paul.
  27. Rissanen, J. (1978). Modeling by shortest data description. Automatica, 14, 465–471.
  28. Schaul, T., Pape, L., Glasmachers, T., Graziano, V., & Schmidhuber, J. (2011a). Coherence progress: a measure of interestingness based on fixed compressors. In Fourth conference on artificial general intelligence (AGI).
  29. Schaul, T., Sun, Y., Wierstra, D., Gomez, F., & Schmidhuber, J. (2011b). Curiosity-driven optimization. In IEEE congress on evolutionary computation (CEC), New Orleans, USA.
  30. Schmidhuber, J. (1991a). Curious model-building control systems. In Proceedings of the international joint conference on neural networks, Singapore (Vol. 2, pp. 1458–1463). New York: IEEE Press.
  31. Schmidhuber, J. (1991b). A possibility for implementing curiosity and boredom in model-building neural controllers. In J. A. Meyer & S. W. Wilson (Eds.), Proc. of the international conference on simulation of adaptive behavior: from animals to animats (pp. 222–227). Cambridge: MIT Press/Bradford Books.
  32. Schmidhuber, J. (1992). Learning complex, extended sequences using the principle of history compression. Neural Computation, 4(2), 234–242.
  33. Schmidhuber, J. (1997a). A computer scientist’s view of life, the universe, and everything. In C. Freksa, M. Jantzen & R. Valk (Eds.), Lecture notes in computer science: Vol. 1337. Foundations of computer science: potential—theory—cognition (pp. 201–208). Berlin: Springer.
  34. Schmidhuber, J. (1997b). Femmes fractales.
  35. Schmidhuber, J. (1997c). Low-complexity art. Leonardo, Journal of the International Society for the Arts, Sciences, and Technology, 30(2), 97–103.
  36. Schmidhuber, J. (1998). Facial beauty and fractal geometry (Technical report TR IDSIA-28-98). IDSIA. Published in the Cogprint Archive.
  37. Schmidhuber, J. (1999). Artificial curiosity based on discovering novel algorithmic predictability through coevolution. In P. Angeline, Z. Michalewicz, M. Schoenauer, X. Yao & Z. Zalzala (Eds.), Congress on evolutionary computation (pp. 1612–1618). New York: IEEE Press.
  38. Schmidhuber, J. (2002a). Exploring the predictable. In A. Ghosh & S. Tsutsui (Eds.), Advances in evolutionary computing (pp. 579–612). Berlin: Springer.
  39. Schmidhuber, J. (2002b). Hierarchies of generalized Kolmogorov complexities and nonenumerable universal measures computable in the limit. International Journal of Foundations of Computer Science, 13(4), 587–612.
  40. Schmidhuber, J. (2002c). The speed prior: a new simplicity measure yielding near-optimal computable predictions. In J. Kivinen & R. H. Sloan (Eds.), Lecture notes in artificial intelligence. Proceedings of the 15th annual conference on computational learning theory (COLT 2002), Sydney, Australia (pp. 216–228). Berlin: Springer.
  41. Schmidhuber, J. (2004). Optimal ordered problem solver. Machine Learning, 54, 211–254.
  42. Schmidhuber, J. (2006a). Developmental robotics, optimal artificial curiosity, creativity, music, and the fine arts. Connection Science, 18(2), 173–187.
  43. Schmidhuber, J. (2006b). Randomness in physics. Nature, 439(3), 392. Correspondence.
  44. Schmidhuber, J. (2007a). Alle berechenbaren universen (all computable universes). Spektrum der Wissenschaft Spezial (German edition of Scientific American), 3, 75–79.
  45. Schmidhuber, J. (2007b). Simple algorithmic principles of discovery, subjective beauty, selective attention, curiosity & creativity. In LNAI: Vol. 4755. Proc. 10th intl. conf. on discovery science (DS 2007) (pp. 26–38). Berlin: Springer. Joint invited lecture for ALT 2007 and DS 2007, Sendai, Japan, 2007.
  46. Schmidhuber, J. (2009a). Art & science as by-products of the search for novel patterns, or data compressible in unknown yet learnable ways. In M. Botta (Ed.), Multiple ways to design research. Research cases that reshape the design discipline (pp. 98–112). Berlin: Springer. Swiss design network—et al. Edizioni.
  47. Schmidhuber, J. (2009b). Driven by compression progress: a simple principle explains essential aspects of subjective beauty, novelty, surprise, interestingness, attention, curiosity, creativity, art, science, music, jokes. In G. Pezzulo, M. V. Butz, O. Sigaud & G. Baldassarre (Eds.), Lecture notes in computer science: Vol. 5499. Anticipatory behavior in adaptive learning systems. From psychological theories to artificial cognitive systems (pp. 48–76). Berlin: Springer.
  48. Schmidhuber, J. (2009c). Simple algorithmic theory of subjective beauty, novelty, surprise, interestingness, attention, curiosity, creativity, art, science, music, jokes. SICE Journal of the Society of Instrument and Control Engineers, 48(1), 21–32.
  49. Schmidhuber, J. (2009d). Ultimate cognition à la Gödel. Cognitive Computation, 1(2), 177–193.
  50. Schmidhuber, J. (2010). Formal theory of creativity, fun, and intrinsic motivation (1990–2010). IEEE Transactions on Autonomous Mental Development, 2(3), 230–247.
  51. Schmidhuber, J., & Heil, S. (1996). Sequential neural text compression. IEEE Transactions on Neural Networks, 7(1), 142–146.
  52. Schmidhuber, J., Zhao, J., & Wiering, M. (1997). Shifting inductive bias with success-story algorithm, adaptive Levin search, and incremental self-improvement. Machine Learning, 28, 105–130.
  53. Shannon, C. E. (1948). A mathematical theory of communication (parts I and II). Bell System Technical Journal, XXVII, 379–423.
  54. Solomonoff, R. J. (1978). Complexity-based induction systems. IEEE Transactions on Information Theory, IT-24(5), 422–432.
  55. Storck, J., Hochreiter, S., & Schmidhuber, J. (1995). Reinforcement driven information acquisition in non-deterministic environments. In Proceedings of the international conference on artificial neural networks (Vol. 2, pp. 159–164). Paris: EC2 & Cie.
  56. Wallace, C. S., & Boulton, D. M. (1968). An information theoretic measure for classification. Computer Journal, 11(2), 185–194.
  57. Wallace, C. S., & Freeman, P. R. (1987). Estimation and inference by compact coding. Journal of the Royal Statistical Society, Series B, 49(3), 240–265.
  58. Wundt, W. M. (1874). Grundzüge der Physiologischen Psychologie. Leipzig: Engelmann.
  59. Zuse, K. (1969). Rechnender raum. Braunschweig: Friedrich Vieweg & Sohn. English translation: Calculating space, MIT Technical Translation AZT-70-164-GEMIT, Massachusetts Institute of Technology (Proj. MAC), Cambridge, Mass, 02139, Feb. 1970.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  1. IDSIA, University of Lugano & SUPSI, Manno-Lugano, Switzerland
