Safe Robot Learning by Energy Limitation

  • Sigurd Mørkved Albrektsen
  • Sigurd Aksnes Fjerdingen
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7508)


Online robot learning has been a goal for researchers for several decades. A problem arises when learning algorithms need to explore the environment, since their actions cannot easily be anticipated. Safety is therefore a major concern when using learning algorithms.

This paper presents a framework for safe robot learning using region classification and energy limitation. The main task of the framework is to ensure safety regardless of a learning algorithm's input to a system. This is necessary to allow a learning robot to explore environments without damaging itself or its surroundings. To ensure safety, the state-space is divided into fatal, supercritical, critical and safe regions, depending on the energy of the system.
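The region classification described above can be sketched as a simple threshold test on the system's energy. This is an illustrative reconstruction, not the authors' implementation; the threshold values and the function name `classify_region` are assumptions for the example.

```python
# Hedged sketch: mapping a system's energy to the paper's four safety
# regions. The numeric thresholds below are hypothetical placeholders.

E_CRITICAL = 1.0       # energy above which the state is "critical"
E_SUPERCRITICAL = 2.0  # ...and "supercritical"
E_FATAL = 3.0          # ...and "fatal" (damage can no longer be ruled out)

def classify_region(energy: float) -> str:
    """Classify a state by its total energy relative to the thresholds."""
    if energy >= E_FATAL:
        return "fatal"
    if energy >= E_SUPERCRITICAL:
        return "supercritical"
    if energy >= E_CRITICAL:
        return "critical"
    return "safe"

# A supervisory layer could override the learner's chosen action
# whenever the classified region is anything other than "safe".
print(classify_region(0.5))  # safe
print(classify_region(2.5))  # supercritical
```

In such a scheme the learning algorithm remains free to explore inside the safe region, while the supervisor intervenes only as the energy approaches the fatal threshold.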

To show the adaptability of the framework, it is applied to two different systems: an actuated swinging pendulum and a mobile platform. In both cases, obstacles at unknown locations are successfully avoided.
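For the pendulum testbed, the quantity an energy-limiting framework would bound is the total mechanical energy of the system. The following is a minimal sketch of that computation, assuming the standard pendulum model; the mass, length, and function name `pendulum_energy` are illustrative, not taken from the paper.

```python
import math

# Assumed physical parameters for the example (not from the paper).
M = 1.0   # pendulum mass [kg]
L = 0.5   # pendulum length [m]
G = 9.81  # gravitational acceleration [m/s^2]

def pendulum_energy(theta: float, omega: float) -> float:
    """Total mechanical energy, with theta = 0 at the bottom position.

    Kinetic term: (1/2) m (l*omega)^2
    Potential term: m g l (1 - cos(theta))
    """
    kinetic = 0.5 * M * (L * omega) ** 2
    potential = M * G * L * (1.0 - math.cos(theta))
    return kinetic + potential

# At rest at the bottom, the energy is zero:
print(pendulum_energy(0.0, 0.0))  # 0.0
```

Comparing this energy against region thresholds tells a supervisor how close the pendulum is to states it must not reach, independently of what the learning algorithm commands.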


Reinforcement Learning · Online Learning · Energy Limitation · Safe Region · Mobile Platform





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Sigurd Mørkved Albrektsen (1)
  • Sigurd Aksnes Fjerdingen (1)
  1. Dept. of Applied Cybernetics, SINTEF ICT, Norway
