Neural Processing Letters, Volume 38, Issue 2, pp 117–129

Towards a Self-Learning Agent: Using Ranking Functions as a Belief Representation in Reinforcement Learning

  • Klaus Häming
  • Gabriele Peters


We propose a combination of belief revision and reinforcement learning that leads to a self-learning agent. The agent exhibits six qualities we deem necessary for a successful and adaptive learner. This is achieved by representing the agent’s belief on two levels, one numerical and one symbolic. The former is implemented using basic reinforcement learning techniques; the latter is represented by Spohn’s ranking functions. To make these ranking functions fit into a reinforcement learning framework, we studied the revision process and identified key weaknesses of the existing approach. Although the existing revision was modeled to support frequent updates, we propose and justify an alternative revision that leads to more plausible results. In an example application, we show the benefits of the new approach, including faster learning and the extraction of learned rules.
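For readers unfamiliar with ranking functions, the following is a minimal sketch of the standard definitions only: a ranking function assigns each possible world a non-negative integer degree of disbelief, with at least one world at rank 0. It does not reproduce the revision operator proposed in the paper. The two propositions (`obstacle`, `move_ok`), the table `KAPPA`, and all function names are hypothetical and chosen purely for illustration.

```python
import math

# Hypothetical worlds: (obstacle, move_ok). Ranks are degrees of disbelief;
# rank 0 marks the most plausible world. These values are illustrative only.
KAPPA = {
    (True, True): 2,    # obstacle present, move still succeeds: very surprising
    (True, False): 1,
    (False, True): 0,   # no obstacle, move succeeds: the expected case
    (False, False): 1,
}

def rank(kappa, prop):
    """Rank of a proposition: the minimum rank of any world satisfying it."""
    ranks = [r for world, r in kappa.items() if prop(world)]
    return min(ranks) if ranks else math.inf

def believes(kappa, prop):
    """A proposition is believed iff its negation has rank greater than 0."""
    return rank(kappa, lambda w: not prop(w)) > 0

def conditional_rank(kappa, consequent, antecedent):
    """kappa(B | A) = kappa(A and B) - kappa(A), for satisfiable A."""
    joint = rank(kappa, lambda w: antecedent(w) and consequent(w))
    return joint - rank(kappa, antecedent)

obstacle = lambda w: w[0]
move_ok = lambda w: w[1]

print(believes(KAPPA, move_ok))                     # unconditional belief in move_ok
print(conditional_rank(KAPPA, move_ok, obstacle))   # disbelief in move_ok given obstacle
```

In this toy table the agent believes the move succeeds by default, while conditioning on an obstacle makes success the disbelieved outcome. This kind of symbolic, rule-like statement is what a ranking function can express alongside numerical value estimates.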


Hybrid learning system, Belief revision, Ranking functions, Reinforcement learning





Copyright information

© Springer Science+Business Media New York 2012

Authors and Affiliations

  1. University of Hagen, Hagen, Germany
