Interactive Selection of Visual Features through Reinforcement Learning

  • Sébastien Jodogne
  • Justus H. Piater


We introduce a new class of Reinforcement Learning algorithms designed to operate in perceptual spaces containing images. They work by classifying percepts with a computer vision algorithm specialized in image recognition, thereby reducing each visual percept to a symbolic class. This approach mitigates the curse of dimensionality to some extent by focusing the agent's attention on distinctive and robust visual features.
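As a rough sketch of this reduction (not the paper's actual algorithm; the `classify` function, feature representation, and `QAgent` class below are all hypothetical), a classifier maps each raw image to a discrete visual class, and a standard tabular method such as Q-learning then operates over those classes instead of raw pixels:

```python
import random
from collections import defaultdict

def classify(image, features):
    """Map a raw image percept to a symbolic visual class.

    Hypothetical reduction: the class is the subset of visual
    features detected in the image, encoded as a hashable key.
    """
    return frozenset(f for f in features if f(image))

class QAgent:
    """Tabular Q-learning over symbolic visual classes, not raw images."""

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.q = defaultdict(float)  # (visual_class, action) -> value
        self.actions = actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def act(self, vclass):
        # Epsilon-greedy action selection over the symbolic class.
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(vclass, a)])

    def update(self, vclass, action, reward, next_vclass):
        # Standard Q-learning backup on the class-level state space.
        best_next = max(self.q[(next_vclass, a)] for a in self.actions)
        target = reward + self.gamma * best_next
        self.q[(vclass, action)] += self.alpha * (target - self.q[(vclass, action)])
```

Because the Q-table is indexed by visual class rather than by image, its size grows with the number of distinctive features, not with the dimensionality of the percept space.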

The visual classes are learned automatically, in a process that relies only on the reinforcement earned by the agent during its interaction with the environment. In this sense, the visual classes are learned interactively and in a task-driven fashion, without an external supervisor. We also show how our algorithms can be extended to perceptual spaces, large or even continuous, on which features can be defined.
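The task-driven refinement can be pictured as follows (a hypothetical criterion only; the abstract does not specify the actual procedure, and the `refine_classes` helper and its variance threshold are assumptions): a visual class whose observed returns are inconsistent is probably lumping together situations the agent should distinguish, so it becomes a candidate for splitting on a new visual feature.

```python
from statistics import pvariance

def refine_classes(experience, threshold=1.0):
    """Flag visual classes whose observed returns are inconsistent.

    `experience` maps each visual class to the list of returns earned
    after acting in it. High variance suggests perceptual aliasing:
    the class conflates situations that deserve distinct features, so
    it should be split. (Hypothetical criterion; the choice of the
    splitting feature is omitted here.)
    """
    return [c for c, returns in experience.items()
            if len(returns) > 1 and pvariance(returns) > threshold]
```

The reward signal alone drives this loop: no external supervisor labels the images, yet the resulting classes end up tailored to the task at hand.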




Copyright information

© Springer-Verlag London Limited 2005

Authors and Affiliations

  • Sébastien Jodogne¹
  • Justus H. Piater¹

  1. Montefiore Institute (B28), University of Liège, Liège, Belgium
