Learn to Swing Up and Balance a Real Pole Based on Raw Visual Input Data

  • Jan Mattner
  • Sascha Lange
  • Martin Riedmiller
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7667)


For the challenging pole balancing task, we propose a system that uses raw visual input data for reinforcement learning to evolve a control strategy. To this end, we use a neural network, a deep autoencoder, to encode the camera images, and thus the system states, in a low-dimensional feature space. The system is compared to controllers that work directly on the motor sensor data. We show that the performance of both systems is of the same order of magnitude.
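The idea of compressing images into a low-dimensional feature space can be illustrated with a minimal sketch. It is not the authors' deep autoencoder: it assumes synthetic one-dimensional "images" (a bright blob whose position stands in for the pole angle), a single tanh hidden layer with two units, and plain stochastic gradient descent. The names `render`, `TinyAutoencoder`, `SIZE`, and all hyperparameters are hypothetical choices made for illustration.

```python
import math
import random

SIZE = 8  # pixels per synthetic image (assumption, not from the paper)


def render(angle):
    """Hypothetical stand-in for a camera image: a blob placed according to the angle."""
    centre = (angle + math.pi) / (2.0 * math.pi) * (SIZE - 1)
    return [math.exp(-0.5 * (i - centre) ** 2) for i in range(SIZE)]


class TinyAutoencoder:
    """Single-hidden-layer autoencoder: tanh encoder, linear decoder."""

    def __init__(self, n_in, n_code, seed=0):
        rng = random.Random(seed)
        self.W_enc = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                      for _ in range(n_code)]
        self.W_dec = [[rng.uniform(-0.1, 0.1) for _ in range(n_code)]
                      for _ in range(n_in)]

    def encode(self, x):
        # Low-dimensional feature vector for one image.
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)))
                for row in self.W_enc]

    def decode(self, z):
        return [sum(w * zi for w, zi in zip(row, z)) for row in self.W_dec]

    def loss(self, data):
        # Mean squared reconstruction error per image.
        total = 0.0
        for x in data:
            y = self.decode(self.encode(x))
            total += sum((yi - xi) ** 2 for yi, xi in zip(y, x))
        return total / len(data)

    def train_step(self, x, lr):
        z = self.encode(x)
        y = self.decode(z)
        err = [yi - xi for yi, xi in zip(y, x)]
        # Gradient w.r.t. the code, computed before the decoder is updated.
        dz = [sum(err[i] * self.W_dec[i][j] for i in range(len(err)))
              for j in range(len(z))]
        # Backpropagate through tanh: d tanh(a)/da = 1 - tanh(a)^2.
        dpre = [dz[j] * (1.0 - z[j] ** 2) for j in range(len(z))]
        for i in range(len(err)):
            for j in range(len(z)):
                self.W_dec[i][j] -= lr * err[i] * z[j]
        for j in range(len(z)):
            for k in range(len(x)):
                self.W_enc[j][k] -= lr * dpre[j] * x[k]


angles = [-math.pi + 2.0 * math.pi * k / 39 for k in range(40)]
images = [render(a) for a in angles]

model = TinyAutoencoder(n_in=SIZE, n_code=2)
loss_before = model.loss(images)

rng = random.Random(1)
for epoch in range(200):
    order = images[:]          # shuffle a copy so `images` stays aligned with `angles`
    rng.shuffle(order)
    for x in order:
        model.train_step(x, lr=0.05)

loss_after = model.loss(images)
features = [model.encode(x) for x in images]  # 2-D codes, one per image
print("reconstruction error before/after:", loss_before, loss_after)
```

In a setup like the one described, feature vectors of this kind would take the place of the sensor-based state as input to the reinforcement learning controller; the sketch stops at the encoding step and does not implement any controller.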


Neural network, Pole balancing, Deep autoencoder, Reinforcement learning, Visual input



Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Jan Mattner (1)
  • Sascha Lange (1)
  • Martin Riedmiller (1)

  1. Machine Learning Lab, Department of Computer Science, University of Freiburg, Freiburg, Germany
