What Can We Learn from Biological Vision Studies for Human Motion Segmentation?

  • Cheng Chen
  • Guoliang Fan
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4292)


We review recent biological vision studies related to human motion segmentation. Our goal is to develop a practically plausible computational framework, guided by recent cognitive and psychological studies of the human visual system, for segmenting the human body in a video sequence. Specifically, we discuss the roles and interactions of bottom-up and top-down processes in visual perception, and how to combine them synergistically in one computational model to guide human motion segmentation. We also examine recent research on biological movement perception, including the neural mechanisms and functions underlying biological movement recognition and two major psychological theories of tracking. We attempt to develop a comprehensive computational model that involves both bottom-up and top-down processing and is deeply inspired by biological motion perception. In this model, object segmentation, motion estimation, and action recognition result from recurrent feedforward (bottom-up) and feedback (top-down) processes. We also raise and discuss some open technical questions for future research.


Action Recognition; Human Visual System; Motion Perception; Object Segmentation; Multiple Object Tracking





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Cheng Chen
    • 1
  • Guoliang Fan
    • 1
  1. School of Electrical and Computer Engineering, Oklahoma State University, Stillwater, USA
