Abstract
One of the major goals in visual neuroscience is to understand how the cortex processes visual information (Marr 1982). A substantial effort has thus gone into characterizing input-output relationships across areas of the visual cortex (DiCarlo et al. 2012), which has yielded an array of computational models. These models have, however, typically focused on one or very few visual areas, modules (form, motion, depth, color) or functions (e.g., object recognition, boundary detection, action recognition); see Poggio and Serre (2013) for a recent review. An integrated framework that would explain the computational mechanisms underlying vision beyond any specific visual area, module or function, while remaining at least consistent with the known anatomy and physiology of the visual cortex, is still lacking. The goal of this review is to draft an initial integrated theory of visual processing in the cortex. We highlight the computational mechanisms that are shared across many successful models and derive a taxonomy of canonical computations. Such an enterprise is reductionist in nature, as we break down the myriad input-output functions found in the visual cortex into a basic set of computations. Identifying canonical computations that are repeated and combined across visual functions will pave the way for the identification of their cortical substrate (Carandini 2012).
Notes
1. The “aperture problem” reflects the inherent ambiguity in the direction of motion of a moving stimulus seen within the receptive field (a small aperture) of a neuron in early visual areas. Because of its limited receptive field, a motion-selective neuron will often produce identical responses to stimuli that vary greatly in their shape, speed and orientation.
2. A full model would also include eye dominance.
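The ambiguity described in note 1 can be illustrated with a simplified geometric sketch. It is not a model taken from the chapter: it assumes an idealized straight edge seen through a small aperture, for which only the velocity component perpendicular to the edge is observable. The function name `normal_component` and the example velocities are hypothetical, chosen for illustration only.

```python
import math

def normal_component(velocity, edge_orientation_deg):
    """Return the velocity component perpendicular to a straight edge.

    Assumption: through a small aperture, motion along the edge leaves
    the local image unchanged, so only this component can be measured.
    """
    theta = math.radians(edge_orientation_deg)
    # Unit normal to an edge oriented at angle theta from the x-axis.
    nx, ny = -math.sin(theta), math.cos(theta)
    vx, vy = velocity
    speed_along_normal = vx * nx + vy * ny
    return (speed_along_normal * nx, speed_along_normal * ny)

# Two different true velocities for a vertical edge (orientation 90 degrees).
v_a = (1.0, 0.0)  # purely horizontal motion
v_b = (1.0, 2.5)  # same horizontal motion plus sliding along the edge

obs_a = normal_component(v_a, 90.0)
obs_b = normal_component(v_b, 90.0)

# Both stimuli yield (up to rounding) the same locally observable motion.
print(obs_a, obs_b)
```

In this sketch the two stimuli differ in speed and direction, yet the locally measurable motion is the same, which is the sense in which a unit with a small receptive field cannot tell them apart.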
References
Adelson EH, Bergen JR (1985) Spatiotemporal energy models for the perception of motion. J Opt Soc Am A 2(2):284–299
Alonso JM, Martinez LM (1998) Functional connectivity between simple cells and complex cells in cat striate cortex. Nat Neurosci 1(5):395–403
Bradley D, Goyal M (2008) Velocity computation in the primate visual system. Nat Rev Neurosci 9(9):686–695
Cadieu CF, Hong H, Yamins DLK, Pinto N, Ardila D, Solomon EA, Majaj NJ, DiCarlo JJ (2014) Deep neural networks rival the representation of primate IT cortex for core visual object recognition. PLoS Comput Biol 10(12):e1003963
Carandini M (2012) From circuits to behavior: a bridge too far? Nat Neurosci 15(4):507–509
Carandini M, Heeger D (2012) Normalization as a canonical neural computation. Nat Rev Neurosci 13:51–62
Chen X, Han F, Poo M-MM, Dan Y (2007) Excitatory and suppressive receptive field subunits in awake monkey primary visual cortex (V1). Proc Natl Acad Sci USA 104(48):19120–19125
Conway BR (2001) Spatial structure of cone inputs to color cells in alert macaque primary visual cortex (V-1). J Neurosci 21(8):2768–2783
Crick F (1984) Function of the thalamic reticular complex: the searchlight hypothesis. Proc Natl Acad Sci USA 81:4586–4590
Crouzet SM, Serre T (2011) What are the visual features underlying rapid object recognition? Front Psychol 2:326
Daugman JG (1980) Two-dimensional spectral analysis of cortical receptive field profile. Vis Res 20:847–856
Daugman JG (1985) Uncertainty relation for resolution in space, spatial frequency, and orientation optimization by two-dimensional visual cortical filters. J Opt Soc Am A 2(7):1160–1169
Dayan P, Abbott LF (2001) Theoretical neuroscience: computational and mathematical modeling of neural systems. MIT Press
Desimone R, Albright TD, Gross CG, Bruce C (1984) Stimulus-selective properties of inferior temporal neurons in the macaque. J Neurosci 4(8):2051–2062
DeValois RL, Albrecht DG, Thorell LG (1982) Spatial-frequency selectivity of cells in macaque visual cortex. Vis Res 22:545–559
DiCarlo JJ, Zoccolan D, Rust NC (2012) How does the brain solve visual object recognition? Neuron 73(3):415–434
Douglas RJ, Martin KAC (2007) Mapping the matrix: the ways of neocortex. Neuron 56(2):226–238
Everingham M, Van Gool L, Williams C, Winn J, Zisserman A (2010) The PASCAL visual object classes (VOC) challenge. Int J Comput Vis 88(2):303–338
Felleman DJ, Van Essen DC (1991) Distributed hierarchical processing in the primate cerebral cortex. Cereb Cortex 1:1–47
Fenno L, Yizhar O, Deisseroth K (2011) The development and application of optogenetics. Annu Rev Neurosci 34:389–412
Finn I, Ferster D (2007) Computational diversity in complex cells of cat primary visual cortex. J Neurosci 27(36):9638–9648
Fleet DJ, Wagner H, Heeger DJ (1996) Neural encoding of binocular disparity: energy models, position shifts and phase shifts. Vis Res 36(12):1839–1857
Fleuret F, Li T, Dubout C, Wampler EK, Yantis S, Geman D (2011) Comparing machines and humans on a visual categorization test. Proc Natl Acad Sci USA 108(43):17621–17625
Fries P (2005) A mechanism for cognitive dynamics: neuronal communication through neuronal coherence. Trends Cogn Sci 9(10):474–480
Fukushima K (1983) Neocognitron: a neural network model for a mechanism of visual pattern recognition. IEEE Trans Syst Man Cybern B Cybern 13:826–834
Geman D, Koloydenko A (1999) Invariant statistics and coding of natural microimages. Proc IEEE Work Stat Comput Theor Vis
Ghodrati M, Farzmahdi A, Rajaei K, Ebrahimpour R, Khaligh-Razavi S-M (2014) Feedforward object-vision models only tolerate small image variations compared to human. Front Comput Neurosci 8:74
Ghodrati M, Khaligh-Razavi S-M, Ebrahimpour R, Rajaei K, Pooyan M (2012) How can selection of biologically inspired features improve the performance of a robust object recognition model? PLoS One 7(2):e32357
Giese MA, Poggio T (2003) Neural mechanisms for the recognition of biological movements. Nat Rev Neurosci 4(3):179–192
Güçlü U, van Gerven MAJ (2015) Deep neural networks reveal a gradient in the complexity of neural representations across the ventral stream. J Neurosci 35(27):10005–10014
Gülçehre C, Bengio Y (2013) Knowledge matters: importance of prior information for optimization. arXiv:1301.4083v6
He K, Zhang X, Ren S, Sun J (2015) Delving deep into rectifiers: surpassing human-level performance on imagenet classification
Heeger DJ (1992) Normalization of cell responses in cat striate cortex. Vis Neurosci 9(2):181–197
Hegdé J, Van Essen DC (2007) A comparative study of shape representation in macaque visual areas V2 and V4. Cereb Cortex 2(May)
Hegdé J, Felleman DJ (2007) Reappraising the functional implications of the primate visual anatomical hierarchy. Neuroscientist 13(5):416–421
Hubel D, Wiesel T (1962) Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. J Physiol 160:106–154
Humanski RA, Wilson HR (1993) Spatial-frequency adaptation: evidence for a multiple-channel model of short-wavelength-sensitive-cone spatial vision. Vis Res 33(5–6):665–675
Jhuang H, Garrote E, Yu X, Khilnani V, Poggio T, Steele AD, Serre T (2010) Automated home-cage behavioural phenotyping of mice. Nat Commun 1(6):1–9
Jhuang H, Serre T, Wolf L, Poggio T (2007) A biologically inspired system for action recognition. In: 2007 IEEE 11th International Conference on Computer Vision, pp 1–8
Johnson EN, Hawken MJ, Shapley R (2001) The spatial transformation of color in the primary visual cortex of the macaque monkey. Nat Neurosci 4(4):409–416
Johnson EN, Hawken MJ, Shapley R (2004) Cone inputs in macaque primary visual cortex. J Neurophysiol 91(6):2501–2514
Johnson EN, Hawken MJ, Shapley R (2008) The orientation selectivity of color-responsive neurons in macaque V1. J Neurosci 28(32):8096–8106
Jones JP, Palmer LA (1987) An evaluation of the two-dimensional Gabor filter model of simple receptive fields in cat striate cortex. J Neurophysiol 58(6):1233–1258
Khaligh-Razavi S-M, Kriegeskorte N (2014) Deep supervised, but not unsupervised, models may explain IT cortical representation. PLoS Comput Biol 10(11):e1003915
Kheradpisheh SR, Ganjtabesh M, Masquelier T (2015) Bio-inspired unsupervised learning of visual features leads to robust invariant object recognition. arXiv:1504.03871v3
Kim J, Mely DA, Serre T (2015) A critical evaluation of computational mechanisms of binocular disparity
Kouh M, Poggio T (2008) A canonical neural circuit for cortical nonlinear operations. Neural Comput 20(6):1427–1451
Landy MS, Movshon JA (1991) Computational models of visual processing. MIT Press
LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436–444
LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324
Li N, DiCarlo JJ (2008) Unsupervised natural experience rapidly alters invariant object representation in visual cortex. Science 321(5895):1502–1507
Li N, DiCarlo JJ (2010) Unsupervised natural visual experience rapidly reshapes size-invariant object representation in inferior temporal cortex. Neuron 67(6):1062–1075
Li N, DiCarlo JJ (2012) Neuronal learning of invariant object representation in the ventral visual stream is not dependent on reward. J Neurosci 32(19):6611–6620
Malik J, Perona P (1990) Preattentive texture discrimination with early vision mechanisms. J Opt Soc Am A 7(5):923–932
Marcelja S (1980) Mathematical description of the responses of simple cortical cells. J Opt Soc Am 70:1297–1300
Marko H, Giebel H (1970) Recognition of handwritten characters with a system of homogeneous layers. Nachrichtentechnische Z 23:455–459
Marr D (1982) Vision: a computational investigation into the human representation and processing of visual information. W.H.Freeman & Co Ltd, San Francisco
Masquelier T, Serre T, Poggio T (2007) Learning complex cell invariance from natural videos: a plausibility proof. Technical report, Massachusetts Institute of Technology, Cambridge MA
Masquelier T, Thorpe SJ (2007) Unsupervised learning of visual features through spike timing dependent plasticity. PLoS Comput Biol 3(2):e31
Maunsell JH, Van Essen DC (1983) Functional properties of neurons in middle temporal visual area of the macaque monkey. II. Binocular interactions and sensitivity to binocular disparity
Mel BW (1997) SEEMORE: combining color, shape, and texture histogramming in a neurally inspired approach to visual object recognition. Neural Comput 9(4):777–804
Mineault P, Khawaja F, Butts D, Pack C (2012) Hierarchical processing of complex motion along the primate dorsal visual pathway. Proc Natl Acad Sci USA 109(16):E972–E980
Movshon JA, Adelson EH, Gizzi MS, Newsome WT (1985) The analysis of moving visual patterns. Pattern Recogn Mech
Mullen KT, Losada MA (1999) The spatial tuning of color and luminance peripheral vision measured with notch filtered noise masking. Vis Res 39(4):721–731
Nishimoto S, Gallant JL (2011) A three-dimensional spatiotemporal receptive field model explains responses of area MT neurons to naturalistic movies. J Neurosci 31(41):14551–14564
Ohzawa I (1998) Mechanisms of stereoscopic vision: the disparity energy model. Curr Opin Neurobiol 8(4):509–515
Ohzawa I, DeAngelis GC, Freeman RD (1997) Encoding of binocular disparity by complex cells in the cat’s visual cortex. J Neurophysiol 77(6):2879–2909
Ohzawa I, DeAngelis GC, Freeman RD (1996) Encoding of binocular disparity by simple cells in the cat’s visual cortex. J Neurophysiol 75(5):1779–1805
O’Reilly RC, Wyatte D, Herd S, Mingus B, Jilk DJ (2013) Recurrent processing during object recognition. Front Psychol 4(April):1–14
Ostojic S, Brunel N (2011) From spiking neuron models to linear-nonlinear models. PLoS Comput Biol 7(1):e1001056
Perrone JA, Thiele A (2001) Speed skills: measuring the visual speed analyzing properties of primate MT neurons. Nat Neurosci 4(5):526–532
Plaza SM, Scheffer LK, Chklovskii DB (2014) Toward large-scale connectome reconstructions. Curr Opin Neurobiol 25:201–210
Poggio T, Serre T (2013) Models of the visual cortex. Scholarpedia 8(4):3516
Poggio T, Smale S (2003) The mathematics of learning: dealing with data. Not Am Math Soc 50(5)
Priebe NJ, Cassanello CR, Lisberger SG (2003) The neural representation of speed in macaque area MT/V5. J Neurosci 23(13):5650–5661
Qian N (1994) Computing stereo disparity and motion with known binocular cell properties. Neural Comput 6(3):390–404
Read JC, Cumming BG (2007) Sensors for impossible stimuli may solve the stereo correspondence problem. Nat Neurosci 10(10):1322–1328
Reichert DP, Serre T (2014) Neuronal synchrony in complex-valued deep networks. In: International Conference on Learning Representations
Rieke F, Warland D, van Steveninck R, Bialek W (1997) Spikes. The MIT Press, Cambridge, Massachusetts
Riesenhuber M, Poggio T (1999) Hierarchical models of object recognition in cortex. Nat Neurosci 2(11):1019–1025
Riesenhuber M, Poggio T (2000) Models of object recognition. Nat Neurosci 3:1199–1204
Ringach DL (2004) Haphazard wiring of simple receptive fields and orientation columns in visual cortex. J Neurophysiol 92:468–476
Ringach DL (2004) Mapping receptive fields in primary visual cortex. J Physiol 558(3):717–728
Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein M, Berg AC, Fei-Fei L (2014) Imagenet large scale visual recognition challenge. arXiv:1409.0575v3
Rust NC, Mante V, Simoncelli EP, Movshon JA (2006) How MT cells analyze the motion of visual patterns. Nat Neurosci 9(11):1421–1431
Rust NC, Schwartz O, Movshon JA, Simoncelli EP (2005) Spatiotemporal elements of macaque V1 receptive fields. Neuron 46(6):945–956
Sanada TM, Ohzawa I (2006) Encoding of three-dimensional surface slant in cat visual areas 17 and 18. J Neurophysiol 95(5):2768–2786
Sasaki K, Ohzawa I (2007) Internal spatial organization of receptive fields of complex cells in the early visual cortex. J Neurophysiol 98(3):1194–1212
Series P, Lorenceau J, Frégnac Y (2003) The silent surround of V1 receptive fields: theory and experiments. J Physiol 97:453–474
Serre T (2014) Hierarchical models of the visual system
Serre T, Kreiman G, Kouh M, Cadieu C, Knoblich U, Poggio T (2007) A quantitative theory of immediate visual recognition. Prog Brain Res 165(06):33–56
Serre T, Oliva A, Poggio T (2007) A feedforward architecture accounts for rapid categorization. Proc Natl Acad Sci USA 104(15):6424–6429
Serre T, Poggio T (2010) A neuromorphic approach to computer vision. Commun ACM 53(10):54
Shapley R, Hawken MJ (2011) Color in the cortex: single- and double-opponent cells. Vis Res 51:701–717
Simoncelli EP, Heeger DJ (1998) A model of neuronal responses in visual area MT. Vis Res 38(5):743–761
Singer W, Gray CM (1995) Visual feature integration and the temporal correlation hypothesis. Annu Rev Neurosci 18:555–586
Stanley GB (2013) Reading and writing the neural code. Nat Neurosci 16(3):259–263
Stevenson IH, Kording KP (2011) How advances in neural recording affect data analysis. Nat Neurosci 14(2):139–142
Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A (2014) Going deeper with convolutions. arXiv:1409.4842v1
Uhlhaas PJ, Pipa G, Lima B, Melloni L, Neuenschwander S, Nikolić D, Singer W (2009) Neural synchrony in cortical networks: history, concept and current status. Front Integr Neurosci 3:17
Ullman S (2007) Object recognition and segmentation by a fragment-based hierarchy. Trends Cogn Sci 11(2):58–64
von der Malsburg C (1994) The correlation theory of brain function. In: Domany E (ed) Models of neural networks II, pp 94–119. Springer
Wallis G, Rolls ET (1997) A model of invariant recognition in the visual system. Prog Neurobiol 51:167–194
Watson AB (1987) Efficiency of a model human image code. J Opt Soc Am A 4(12):2401–2417
Watson AB, Ahumada AJ (1985) Model of human visual-motion sensing. J Opt Soc Am A 2(2):322–341
Watson AB, Barlow HB, Robson JG (1983) What does the eye see best? Nature 302(5907):419–422
Wersing H, Koerner E (2003) Learning optimized features for hierarchical models of invariant recognition. Neural Comput 15(7):1559–1588
Yamins DLK, Hong H, Cadieu CF, Solomon EA, Seibert D, DiCarlo JJ (2014) Performance-optimized hierarchical models predict neural responses in higher visual cortex. Proc Natl Acad Sci USA 111(23):8619–8624
Young RA (1987) The Gaussian derivative model for spatial vision: I. Retinal mechanisms. Spat Vis 2(4):273–293
Young RA, Lesperance RM (2001) The Gaussian derivative model for spatial-temporal vision: II. Cortical data. Spat Vis 14(3):321–389
Zhang J, Barhomi Y, Serre T (2012) A new biologically inspired color image descriptor. In: European Conference on Computer Vision. LNCS, vol 7576, pp 312–324
Copyright information
© 2017 Springer Science+Business Media Singapore
About this chapter
Cite this chapter
Mély, D.A., Serre, T. (2017). Towards a Theory of Computation in the Visual Cortex. In: Zhao, Q. (eds) Computational and Cognitive Neuroscience of Vision. Cognitive Science and Technology. Springer, Singapore. https://doi.org/10.1007/978-981-10-0213-7_4
DOI: https://doi.org/10.1007/978-981-10-0213-7_4
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-0211-3
Online ISBN: 978-981-10-0213-7
eBook Packages: Engineering, Engineering (R0)