Abstract
The neural mechanisms underlying motion segregation and integration remain largely unclear. Local motion estimates are often ambiguous in the absence of form features, such as corners or junctions. Moreover, even in the presence of such features, local motion estimates may be wrong if they are generated near occlusions or from transparent objects. Here, a neural model of visual motion processing is presented that involves early stages of the cortical dorsal and ventral pathways. We investigate the computational mechanisms of V1-MT feedforward and feedback processing in the perception of coherent shape motion. In particular, we demonstrate how modulatory MT-V1 feedback helps to stabilize localized feature signals, e.g. at corners, and to disambiguate initial flow estimates that signal ambiguous movement due to the aperture problem for single shapes. In cluttered environments with multiple moving objects, partial occlusions may occur which, in turn, generate erroneous motion signals at points of overlapping form. Intrinsic-extrinsic region boundaries are indicated by local T-junctions of possibly any orientation and spatial configuration. Such junctions generate strong localized feature-tracking signals that inject erroneous motion directions into the integration process. We describe a simple local mechanism of excitatory form-motion interaction that modifies spurious motion cues at T-junctions. In concert with local competitive-cooperative mechanisms of the motion pathway, the motion signals are subsequently segregated into coherent representations of moving shapes. Computer simulations demonstrate the competence of the proposed neural model.
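The aperture problem mentioned above can be illustrated with a minimal numerical sketch (not the paper's neural model): a single oriented edge constrains only the velocity component along its normal, so one constraint is underdetermined, while two differently oriented constraints, as arise at a corner, pin the velocity down uniquely (the classical intersection-of-constraints idea). The angles and the true velocity below are arbitrary illustrative values.

```python
import numpy as np

# Assumed true velocity of a translating shape (illustrative value).
true_v = np.array([1.0, 0.5])

def normal_constraint(theta, v):
    """Brightness-constancy constraint from an edge with normal angle theta:
    n . v = s, where s is the measured normal speed. The tangential
    component of v is left unconstrained -- this is the aperture problem."""
    n = np.array([np.cos(theta), np.sin(theta)])
    return n, n @ v

# One edge: a single equation in two unknowns; infinitely many velocities
# satisfy it, all differing by a vector tangential to the edge.
n1, s1 = normal_constraint(0.3, true_v)

# A second, differently oriented edge (e.g. the other arm of a corner)
# adds an independent constraint; solving both recovers the velocity.
n2, s2 = normal_constraint(1.2, true_v)
A = np.vstack([n1, n2])
b = np.array([s1, s2])
v_est, *_ = np.linalg.lstsq(A, b, rcond=None)

print(np.allclose(v_est, true_v))  # prints True
```

The least-squares combination of normal constraints is in the spirit of Lucas and Kanade (1981); the model presented in the article instead resolves the ambiguity dynamically, through recurrent V1-MT interaction.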
Cite this article
Bayerl, P., Neumann, H. Disambiguating Visual Motion by Form-Motion Interaction—a Computational Model. Int J Comput Vision 72, 27–45 (2007). https://doi.org/10.1007/s11263-006-8891-8