Action Recognition with a Bio–inspired Feedforward Motion Processing Model: The Richness of Center-Surround Interactions

  • Maria-Jose Escobar
  • Pierre Kornprobst
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5305)


Here we show that reproducing the functional properties of MT cells with various center-surround interactions enriches motion representation and improves action recognition performance. To do so, we propose a simplified bio-inspired model of the motion pathway in primates. The model is feedforward and restricted to the V1-MT cortical layers; its cortical cells cover the visual space with a foveated structure; and, more importantly, it reproduces some of the richness of the center-surround interactions of MT cells. Interestingly, as observed in neurophysiology, our MT cells not only behave like simple velocity detectors but also respond to several kinds of motion contrasts. Results show that this diversity of motion representation at the MT level is a major advantage for an action recognition task. Using motion maps as our feature vectors, we applied a standard classification method to the Weizmann database and obtained an average recognition rate of 98.9%, which is superior to the recent results of Jhuang et al. (2007). These promising results encourage us to further develop bio-inspired models incorporating other brain mechanisms and cortical layers in order to deal with more complex videos.
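As a rough illustration of what a center-surround interaction does to a motion representation, the sketch below applies a generic difference-of-Gaussians operator to a toy map of direction-selective activity. It is not the model proposed in the paper, whose equations are not given on this page: the function names (`gaussian_kernel`, `weighted_sum`, `mt_response`), the kernel sizes, and the surround weight `w_s` are all assumptions chosen for the example. It only shows the qualitative point made in the abstract, namely that a unit with an antagonistic surround responds to motion contrast rather than to uniform motion.

```python
import math

def gaussian_kernel(size, sigma):
    """Return a normalized size x size Gaussian kernel as nested lists."""
    half = size // 2
    k = [[math.exp(-((x - half) ** 2 + (y - half) ** 2) / (2 * sigma ** 2))
          for x in range(size)] for y in range(size)]
    total = sum(sum(row) for row in k)
    return [[v / total for v in row] for row in k]

def weighted_sum(field, kernel, cy, cx):
    """Kernel-weighted sum of `field` centered at (cy, cx); outside the grid counts as zero."""
    half = len(kernel) // 2
    acc = 0.0
    for ky, row in enumerate(kernel):
        for kx, w in enumerate(row):
            y, x = cy + ky - half, cx + kx - half
            if 0 <= y < len(field) and 0 <= x < len(field[0]):
                acc += w * field[y][x]
    return acc

def mt_response(field, cy, cx, size=9, sigma_c=1.0, sigma_s=3.0, w_s=1.0):
    """Hypothetical MT-like unit: narrow excitatory center minus wide inhibitory surround, rectified."""
    center = weighted_sum(field, gaussian_kernel(size, sigma_c), cy, cx)
    surround = weighted_sum(field, gaussian_kernel(size, sigma_s), cy, cx)
    return max(0.0, center - w_s * surround)

# Toy input: activity of cells tuned to one motion direction on a 21 x 21 grid.
n = 21
uniform = [[1.0] * n for _ in range(n)]          # same motion everywhere
patch = [[1.0 if abs(y - 10) <= 1 and abs(x - 10) <= 1 else 0.0
          for x in range(n)] for y in range(n)]  # small moving patch on a static background

print("uniform motion:", mt_response(uniform, 10, 10))
print("moving patch:  ", mt_response(patch, 10, 10))
```

With these assumed parameters the unit is almost silent for uniform motion, because center and surround cancel, and responds clearly to the isolated moving patch. Varying the surround shape or weight would give other contrast sensitivities, which is the kind of diversity the abstract refers to.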


Keywords: Motion Estimation, Action Recognition, Direction Selectivity, Motion Representation, Average Recognition Rate


References

  1. Gavrila, D.: The visual analysis of human movement: A survey. Computer Vision and Image Understanding 73(1), 82–98 (1999)
  2. Goncalves, L., DiBernardo, E., Ursella, E., Perona, P.: Monocular tracking of the human arm in 3D. In: Proceedings of the 5th International Conference on Computer Vision, June 1995, pp. 764–770 (1995)
  3. Mokhber, A., Achard, C., Milgram, M.: Recognition of human behavior by space-time silhouette characterization. Pattern Recognition Letters 29(1), 81–89 (2008)
  4. Seitz, S., Dyer, C.: View-invariant analysis of cyclic motion. The International Journal of Computer Vision 25(3), 231–251 (1997)
  5. Collins, R., Gross, R., Shi, J.: Silhouette-based human identification from body shape and gait. In: 5th Intl. Conf. on Automatic Face and Gesture Recognition, p. 366 (2002)
  6. Zelnik-Manor, L., Irani, M.: Event-based analysis of video. In: Proceedings of CVPR 2001, vol. 2, pp. 123–128 (2001)
  7. Efros, A., Berg, A., Mori, G., Malik, J.: Recognizing action at a distance. In: Proceedings of the 9th International Conference on Computer Vision, vol. 2, pp. 726–734 (October 2003)
  8. Laptev, I., Caputo, B., Schuldt, C., Lindeberg, T.: Local velocity-adapted motion events for spatio-temporal recognition. Computer Vision and Image Understanding 108(3), 207–229 (2007)
  9. Dollar, P., Rabaud, V., Cottrell, G., Belongie, S.: Behavior recognition via sparse spatio-temporal features. In: VS-PETS, pp. 65–72 (2005)
  10. Michels, L., Lappe, M., Vaina, L.: Visual areas involved in the perception of human movement from dynamic analysis. Brain Imaging 16(10), 1037–1041 (2005)
  11. Niebles, J.C., Wang, H., Fei-Fei, L.: Unsupervised learning of human action categories using spatial–temporal words. International Journal of Computer Vision 79(3), 299–318 (2008)
  12. Wong, S.F., Kim, T.K., Cipolla, R.: Learning motion categories using both semantic and structural information. In: Proceedings of the International Conference on Computer Vision and Pattern Recognition, pp. 1–6 (June 2007)
  13. Giese, M., Poggio, T.: Neural mechanisms for the recognition of biological movements and actions. Nature Reviews Neuroscience 4, 179–192 (2003)
  14. Jhuang, H., Serre, T., Wolf, L., Poggio, T.: A biologically inspired system for action recognition. In: Proceedings of the 11th International Conference on Computer Vision, pp. 1–8 (2007)
  15. Serre, T., Wolf, L., Poggio, T.: Object recognition with features inspired by visual cortex. In: Proceedings of the International Conference on Computer Vision and Pattern Recognition, pp. 994–1000 (June 2005)
  16. Xiao, D.K., Raiguel, S., Marcar, V., Orban, G.A.: The spatial distribution of the antagonistic surround of MT/V5 neurons. Cereb Cortex 7(7), 662–677 (1997)
  17. Xiao, D., Raiguel, S., Marcar, V., Koenderink, J., Orban, G.A.: Spatial heterogeneity of inhibitory surrounds in the middle temporal visual area. Proceedings of the National Academy of Sciences 92(24), 11303–11306 (1995)
  18. Escobar, M., Masson, G., Kornprobst, P.: A simple mechanism to reproduce the neural solution of the aperture problem in monkey area MT. Research Report 6579, INRIA (2008)
  19. Tsotsos, J., Liu, Y., Martinez-Trujillo, J., Pomplun, M., Simine, E., Zhou, K.: Attending to visual motion. Computer Vision and Image Understanding 100, 3–40 (2005)
  20. Nowlan, S., Sejnowski, T.: A selection model for motion processing in area MT of primates. J. Neuroscience 15, 1195–1214 (1995)
  21. Rust, N., Mante, V., Simoncelli, E., Movshon, J.: How MT cells analyze the motion of visual patterns. Nature Neuroscience (11), 1421–1431 (2006)
  22. Simoncelli, E.P., Heeger, D.: A model of neuronal responses in visual area MT. Vision Research 38, 743–761 (1998)
  23. Grzywacz, N., Yuille, A.: A model for the estimate of local image velocity by cells on the visual cortex. Proc. R. Soc. Lond. B. Biol. Sci. 239(1295), 129–161 (1990)
  24. Berzhanskaya, J., Grossberg, S., Mingolla, E.: Laminar cortical dynamics of visual form and motion interactions during coherent object motion perception. Spatial Vision 20(4), 337–395 (2007)
  25. Bayerl, P., Neumann, H.: Disambiguating visual motion by form–motion interaction – a computational model. International Journal of Computer Vision 72(1), 27–45 (2007)
  26. Adelson, E., Bergen, J.: Spatiotemporal energy models for the perception of motion. Journal of the Optical Society of America A 2, 284–299 (1985)
  27. Carandini, M., Demb, J.B., Mante, V., Tolhurst, D.J., Dan, Y., Olshausen, B.A., Gallant, J.L., Rust, N.C.: Do we know what the early visual system does? Journal of Neuroscience 25(46), 10577–10597 (2005)
  28. Robson, J.: Spatial and temporal contrast-sensitivity functions of the visual system. J. Opt. Soc. Am. 69, 1141–1142 (1966)
  29. Albrecht, D., Geisler, W., Crane, A.: Nonlinear properties of visual cortex neurons: Temporal dynamics, stimulus selectivity, neural performance, pp. 747–764. MIT Press, Cambridge (2003)
  30. Destexhe, A., Rudolph, M., Paré, D.: The high-conductance state of neocortical neurons in vivo. Nature Reviews Neuroscience 4, 739–751 (2003)
  31. Priebe, N., Cassanello, C., Lisberger, S.: The neural representation of speed in macaque area MT/V5. Journal of Neuroscience 23(13), 5650–5661 (2003)
  32. Perrone, J., Thiele, A.: Speed skills: measuring the visual speed analyzing properties of primate MT neurons. Nature Neuroscience 4(5), 526–532 (2001)
  33. Liu, J., Newsome, W.T.: Functional organization of speed tuned neurons in visual area MT. Journal of Neurophysiology 89, 246–256 (2003)
  34. Perrone, J.: A visual motion sensor based on the properties of V1 and MT neurons. Vision Research 44, 1733–1755 (2004)
  35. Huang, X., Albright, T.D., Stoner, G.R.: Adaptive surround modulation in cortical area MT. Neuron 53, 761–770 (2007)
  36. Topsoe, F.: Some inequalities for information divergence and related measures of discrimination. IEEE Transactions on Information Theory 46(4), 1602–1609 (2000)
  37. Zelnik-Manor, L., Irani, M.: Statistical analysis of dynamic actions. IEEE Transactions on Pattern Analysis and Machine Intelligence 28(9), 1530–1535 (2006)
  38. Blank, M., Gorelick, L., Shechtman, E., Irani, M., Basri, R.: Actions as space-time shapes. Proceedings of the 10th International Conference on Computer Vision 2, 1395–1402 (2005)

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Maria-Jose Escobar (1)
  • Pierre Kornprobst (1)
  1. Odyssée project team, INRIA, Sophia-Antipolis, France
