A First Approach of a New Learning Strategy: Learning by Confirmation

  • Alejandro Carpio
  • Matilde Santos
  • José Antonio Martín
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 214)


Learning by confirmation is a new learning approach that combines two types of supervised learning strategies: reinforcement learning and learning by examples. In this paper, we show how this strategy accelerates the learning process when prior knowledge is introduced into the reinforcement algorithm. The proposal has been tested on a real-time device, a Lego Mindstorms NXT 2.0 robot configured as an inverted pendulum. The methodology performs well on this task, and the results are promising.
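The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of introducing knowledge into a reinforcement learner. It assumes tabular Q-learning on a hypothetical 8-state chain (not the inverted-pendulum robot), and it assumes that "examples" are (state, action) pairs used to seed the Q-table with a small prior value. All names and parameters (`make_q`, `run_episode`, `prior`, the chain environment) are illustrative and are not taken from the paper.

```python
import random

N_STATES = 8          # hypothetical chain: states 0..7, goal at state 7
ACTIONS = (-1, +1)    # action 0 moves left, action 1 moves right

def step(state, action_idx):
    """Toy environment: reward 1 on reaching the goal, 0 otherwise."""
    nxt = min(max(state + ACTIONS[action_idx], 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def make_q(examples=(), prior=0.5):
    """Zero Q-table; each example (state, action) pair gets a prior value.

    Seeding from examples is an assumption of this sketch, not the
    paper's documented method.
    """
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for s, a in examples:
        q[s][a] = prior
    return q

def choose(q, s, eps, rng):
    """Epsilon-greedy action choice with random tie-breaking."""
    if rng.random() < eps or q[s][0] == q[s][1]:
        return rng.randrange(2)
    return 0 if q[s][0] > q[s][1] else 1

def run_episode(q, rng, alpha=0.5, gamma=0.9, eps=0.1, max_steps=10_000):
    """Run one Q-learning episode from state 0; return the steps taken."""
    s, steps, done = 0, 0, False
    while not done and steps < max_steps:
        a = choose(q, s, eps, rng)
        nxt, r, done = step(s, a)
        # Standard Q-learning target: no bootstrapping from a terminal state.
        target = r if done else r + gamma * max(q[nxt])
        q[s][a] += alpha * (target - q[s][a])
        s, steps = nxt, steps + 1
    return steps

if __name__ == "__main__":
    examples = [(s, 1) for s in range(N_STATES - 1)]  # "always move right"
    seeded = run_episode(make_q(examples), random.Random(0), eps=0.0)
    blank = run_episode(make_q(), random.Random(0), eps=0.0)
    print("first episode, seeded table:", seeded)
    print("first episode, blank table:", blank)
```

With exploration switched off (`eps=0.0`), the seeded table follows the examples straight to the goal in its first episode, while the blank table has to wander until it finds the reward. This is the kind of speed-up the abstract attributes to added knowledge, shown here on a toy problem only.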


Keywords: Learning by Confirmation; Reinforcement Learning; Learning by examples; Lego NXT; Inverted Pendulum



This work has been partially supported by the Spanish project DPI2009-14552-C02-01.



Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  • Alejandro Carpio (1)
  • Matilde Santos (1)
  • José Antonio Martín (1)
  1. Computer Architecture and Systems Engineering, Facultad de Informática, Universidad Complutense de Madrid, Madrid, Spain
