Reinforcement Learning in Single Robot Hose Transport Task: A Physical Proof of Concept

  • Jose Manuel Lopez-Guede (Email author)
  • Julián Estévez
  • Manuel Graña
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 368)


In this paper we address the physical realization of proof-of-concept experiments demonstrating the suitability of controllers learned by means of Reinforcement Learning (RL) techniques for tasks involving Linked Multi-Component Robotic Systems (LMCRS). We deal with the task of transporting a hose by a single robot as a prototypical example of LMCRS, which can be extended to much more complex tasks. We describe how the complete system has been designed and built, explaining its main components: the RL controller, the communications, and the monitoring system. A previously learned RL controller has been tested on a concrete problem with a given state space model and discretization step. This physical realization validates our previously published work carried out through computer simulations, giving a strong argument in favor of the suitability of RL techniques for real LMCRS.
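
As a rough illustration of what running a previously learned controller over a discretized state space could look like, the following minimal sketch assumes a tabular action-value (Q) table, a two-dimensional position state, and four movement actions. None of these details are given in the abstract; the names `discretize`, `greedy_action`, `ACTIONS`, and the sample values are hypothetical.

```python
from typing import Dict, Tuple

State = Tuple[int, int]

# Hypothetical action set; the abstract does not list the robot's actions.
ACTIONS = ("north", "south", "east", "west")


def discretize(x: float, y: float, step: float) -> State:
    """Map a continuous position to a grid cell using a fixed discretization step."""
    if step <= 0:
        raise ValueError("step must be positive")
    return (int(x // step), int(y // step))


def greedy_action(q_table: Dict[State, Dict[str, float]], state: State) -> str:
    """Pick the highest-valued action for a state.

    States missing from the table fall back to the first action (an assumption).
    """
    values = q_table.get(state)
    if not values:
        return ACTIONS[0]
    return max(ACTIONS, key=lambda a: values.get(a, float("-inf")))


# Toy table standing in for a controller learned beforehand in simulation.
q_table = {(2, 1): {"north": 0.1, "east": 0.7}}
state = discretize(1.1, 0.6, step=0.5)
action = greedy_action(q_table, state)
print(state, action)
```

In a physical setup, the position passed to `discretize` would presumably come from the monitoring system and the chosen action would be sent to the robot over the communications link, but that wiring is outside this sketch.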


Reinforcement learning, Linked multicomponent robotic systems, LMCRS, Hose transport, Proof of concept



The research was supported by the Computational Intelligence Group of the Basque Country University (UPV/EHU) through Grant IT874-13 of Research Groups Call 2013-2017 (Basque Country Government).



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Jose Manuel Lopez-Guede (1, Email author)
  • Julián Estévez (1)
  • Manuel Graña (1)
  1. Computational Intelligence Group of the Basque Country University (UPV/EHU), San Sebastian, Spain
