Learning Visual Representations for Interactive Systems

  • Justus Piater
  • Sébastien Jodogne
  • Renaud Detry
  • Dirk Kraft
  • Norbert Krüger
  • Oliver Krömer
  • Jan Peters
Part of the Springer Tracts in Advanced Robotics book series (STAR, volume 70)


We describe two quite different methods for associating action parameters with visual percepts. Our RLVC algorithm performs reinforcement learning directly on the visual input space. To make this very large space manageable, RLVC interleaves the reinforcement learner with a supervised classification algorithm that seeks to split perceptual states so as to reduce perceptual aliasing. This results in an adaptive discretization of the perceptual space based on the presence or absence of visual features. Its extension RLJC also handles continuous action spaces. In contrast to the minimalistic visual representations produced by RLVC and RLJC, our second method learns structural object models for robust object detection and pose estimation by probabilistic inference. The method associates these models with grasp experiences learned autonomously by trial and error. These experiences form a nonparametric representation of grasp success likelihoods over gripper poses, which we call a grasp density. Thus, object detection in a novel scene simultaneously produces suitable grasping options.
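
To make the idea of an adaptive discretization based on the presence or absence of visual features more concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes that a percept is summarized as a set of detected feature names and that the discretization is stored as a binary tree whose leaves are perceptual classes. The names `Leaf`, `Split`, `classify`, and `refine`, and the feature labels `corner_A` and `blob_B`, are hypothetical. The abstract does not specify how RLVC decides which class to split or which feature to test, so the sketch takes both as inputs.

```python
from dataclasses import dataclass
from typing import FrozenSet, Union


@dataclass
class Leaf:
    state_id: int  # index of one perceptual class (hypothetical encoding)


@dataclass
class Split:
    feature: str     # visual feature tested at this node (hypothetical label)
    present: "Node"  # subtree followed when the feature is detected
    absent: "Node"   # subtree followed when the feature is not detected


Node = Union[Leaf, Split]


def classify(node: Node, features: FrozenSet[str]) -> int:
    """Map a percept, given as its set of detected features, to a class index."""
    while isinstance(node, Split):
        node = node.present if node.feature in features else node.absent
    return node.state_id


def refine(node: Node, state_id: int, feature: str, next_id: int) -> Node:
    """Return a new tree in which the leaf `state_id` is split on `feature`.

    Percepts showing the feature keep `state_id`; the others get `next_id`.
    Choosing `state_id` and `feature` is left to the caller (assumption).
    """
    if isinstance(node, Leaf):
        if node.state_id == state_id:
            return Split(feature, Leaf(state_id), Leaf(next_id))
        return node
    return Split(
        node.feature,
        refine(node.present, state_id, feature, next_id),
        refine(node.absent, state_id, feature, next_id),
    )


# Start with a single class containing every percept, then refine twice.
tree: Node = Leaf(0)
tree = refine(tree, 0, "corner_A", 1)  # class 0 split on corner_A
tree = refine(tree, 1, "blob_B", 2)    # class 1 split on blob_B

with_corner = classify(tree, frozenset({"corner_A"}))
with_blob = classify(tree, frozenset({"blob_B"}))
with_nothing = classify(tree, frozenset())
```

In this sketch a reinforcement learner would keep one value estimate per class index and per action, and a refinement step would be triggered only for classes in which percepts that call for different actions are currently merged.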


Visual Feature; Markov Decision Process; Perceptual Space; Binary Decision Diagram; Markov Network
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Justus Piater
    • 1
  • Sébastien Jodogne
    • 1
  • Renaud Detry
    • 1
  • Dirk Kraft
    • 2
  • Norbert Krüger
    • 2
  • Oliver Krömer
    • 3
  • Jan Peters
    • 3
  1. INTELSIG Laboratory, Université de Liège, Belgium
  2. University of Southern Denmark, Denmark
  3. Max Planck Institute for Biological Cybernetics, Tübingen, Germany
