Towards a Theory of Computation in the Visual Cortex

Mély, David A.; Serre, Thomas

doi:10.1007/978-981-10-0213-7_4

Part of the book series: Cognitive Science and Technology (CSAT)

Abstract

One of the major goals in visual neuroscience is to understand how the cortex processes visual information (Marr 1982). A substantial effort has thus gone into characterizing input-output relationships across areas of the visual cortex (DiCarlo et al. 2012), which has yielded an array of computational models. These models have, however, typically focused on one or very few visual areas, modules (form, motion, depth, color) or functions (e.g., object recognition, boundary detection, action recognition); see Poggio and Serre (2013) for a recent review. An integrated framework that explains the computational mechanisms underlying vision beyond any specific visual area, module or function, while remaining at least consistent with the known anatomy and physiology of the visual cortex, is still lacking. The goal of this review is to draft an initial integrated theory of visual processing in the cortex. We highlight the computational mechanisms that are shared across many successful models and derive a taxonomy of canonical computations. Such an enterprise is reductionist in nature, as we break down the myriad input-output functions found in the visual cortex into a basic set of computations. Identifying canonical computations that are repeated and combined across visual functions will pave the way for the identification of their cortical substrate (Carandini 2012).

Notes

1. The “aperture problem” reflects the inherent ambiguity associated with the direction of motion of a moving stimulus within the receptive fields (a small aperture) of neurons in early visual areas. Because of its limited receptive field, a motion-selective neuron will often produce identical responses for stimuli that vary greatly in their shape, speed and orientation.
2. A full model would also include eye dominance.
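
The ambiguity described in Note 1 can be illustrated with a small numerical sketch. It is not taken from the chapter: the function name `normal_component` and the example velocities are hypothetical, and the sketch assumes an idealized straight edge seen through a small aperture, for which only the velocity component perpendicular to the edge is observable.

```python
import math

def normal_component(velocity, edge_angle_deg):
    """Return the part of a 2D velocity that is perpendicular to a straight edge.

    edge_angle_deg is the orientation of the edge itself, in degrees
    (90 means a vertical edge). Motion along the edge leaves the image
    inside a small aperture unchanged, so only this component is observable.
    """
    theta = math.radians(edge_angle_deg)
    # Unit vector perpendicular to the edge direction (cos(theta), sin(theta)).
    nx, ny = -math.sin(theta), math.cos(theta)
    speed = velocity[0] * nx + velocity[1] * ny  # projection onto the normal
    return (speed * nx, speed * ny)

# Two different true velocities of the same vertical edge (illustrative values).
slow_horizontal = (1.0, 0.0)
fast_oblique = (1.0, 2.0)

seen_a = normal_component(slow_horizontal, 90.0)
seen_b = normal_component(fast_oblique, 90.0)

# Up to floating-point rounding, both stimuli yield the same local motion signal.
print(seen_a, seen_b)
```

Under these assumptions, the two stimuli differ in both speed and direction, yet a detector restricted to the aperture receives the same signal from each, which is one way to read the note's statement about identical responses.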

References

Adelson EH, Bergen JR (1985) Spatiotemporal energy models for the perception of motion. J Opt Soc Am A 2(2):284–299
Alonso JM, Martinez LM (1998) Functional connectivity between simple cells and complex cells in cat striate cortex. Nat Neurosci 1(5):395–403
Bradley D, Goyal M (2008) Velocity computation in the primate visual system. Nat Rev Neurosci 9(9):686–695
Cadieu CF, Hong H, Yamins DLK, Pinto N, Ardila D, Solomon EA, Majaj NJ, DiCarlo JJ (2014) Deep neural networks rival the representation of primate IT cortex for core visual object recognition. PLoS Comput Biol 10(12):e1003963
Carandini M (2012) From circuits to behavior: a bridge too far? Nat Neurosci 15(4):507–509
Carandini M, Heeger D (2012) Normalization as a canonical neural computation. Nat Rev Neurosci 13:51–62
Chen X, Han F, Poo M-MM, Dan Y (2007) Excitatory and suppressive receptive field subunits in awake monkey primary visual cortex (V1). Proc Natl Acad Sci USA 104(48):19120–19125
Conway BR (2001) Spatial structure of cone inputs to color cells in alert macaque primary visual cortex (V-1). J Neurosci 21(8):2768–2783
Crick F (1984) Function of the thalamic reticular complex: the searchlight hypothesis. Proc Natl Acad Sci USA 81:4586–4590
Crouzet SM, Serre T (2011) What are the visual features underlying rapid object recognition? Front Psychol 2:326
Daugman JG (1980) Two-dimensional spectral analysis of cortical receptive field profile. Vis Res 20:847–856
Daugman JG (1985) Uncertainty relation for resolution in space, spatial frequency, and orientation optimization by two-dimensional visual cortical filters. J Opt Soc Am A 2(7):1160–1169
Dayan P, Abbott LF (2001) Theoretical neuroscience: computational and mathematical modeling of neural systems. MIT Press
Desimone R, Albright TD, Gross CG, Bruce C (1984) Stimulus-selective properties of inferior temporal neurons in the macaque. J Neurosci 4(8):2051–2062
DeValois RL, Albrecht DG, Thorell LG (1982) Spatial-frequency selectivity of cells in macaque visual cortex. Vis Res 22:545–559
DiCarlo JJ, Zoccolan D, Rust NC (2012) How does the brain solve visual object recognition? Neuron 73(3):415–434
Douglas RJ, Martin KAC (2007) Mapping the matrix: the ways of neocortex. Neuron 56(2):226–238
Everingham M, Van Gool L, Williams C, Winn J, Zisserman A (2010) The PASCAL visual object classes (VOC) challenge. Int J Comput Vis 88(2):303–338
Felleman DJ, Van Essen DC (1991) Distributed hierarchical processing in the primate cerebral cortex. Cereb Cortex 1:1–47
Fenno L, Yizhar O, Deisseroth K (2011) The development and application of optogenetics. Annu Rev Neurosci 34:389–412
Finn I, Ferster D (2007) Computational diversity in complex cells of cat primary visual cortex. J Neurosci 27(36):9638–9648
Fleet DJ, Wagner H, Heeger DJ (1996) Neural encoding of binocular disparity: energy models, position shifts and phase shifts. Vis Res 36(12):1839–1857
Fleuret F, Li T, Dubout C, Wampler EK, Yantis S, Geman D (2011) Comparing machines and humans on a visual categorization test. Proc Natl Acad Sci USA 108(43):17621–17625
Fries P (2005) A mechanism for cognitive dynamics: neuronal communication through neuronal coherence. Trends Cogn Sci 9(10):474–480
Fukushima K (1983) Neocognitron: a neural network model for a mechanism of visual pattern recognition. IEEE Trans Syst Man Cybern B Cybern 13:826–834
Geman D, Koloydenko A (1999) Invariant statistics and coding of natural microimages. Proc IEEE Work Stat Comput Theor Vis
Ghodrati M, Farzmahdi A, Rajaei K, Ebrahimpour R, Khaligh-Razavi S-M (2014) Feedforward object-vision models only tolerate small image variations compared to human. Front Comput Neurosci 8:74
Ghodrati M, Khaligh-Razavi S-M, Ebrahimpour R, Rajaei K, Pooyan M (2012) How can selection of biologically inspired features improve the performance of a robust object recognition model? PLoS One 7(2):e32357
Giese MA, Poggio T (2003) Neural mechanisms for the recognition of biological movements. Nat Rev Neurosci 4(3):179–192
Guclu U, van Gerven MAJ (2015) Deep neural networks reveal a gradient in the complexity of neural representations across the ventral stream. J Neurosci 35(27):10005–10014
Gülçehre C, Bengio Y (2013) Knowledge matters: importance of prior information for optimization. arXiv:1301.4083v6
He K, Zhang X, Ren S, Sun J (2015) Delving deep into rectifiers: surpassing human-level performance on imagenet classification
Heeger DJ (1992) Normalization of cell responses in cat striate cortex. Vis Neurosci 9(2):181–197
Hegdé J, Van Essen DC (2007) A comparative study of shape representation in macaque visual areas V2 and V4. Cereb Cortex 2(May)
Hegdé J, Felleman DJ (2007) Reappraising the functional implications of the primate visual anatomical hierarchy. Neuroscientist 13(5):416–421
Hubel D, Wiesel T (1962) Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. J Physiol 160:106–154
Humanski RA, Wilson HR (1993) Spatial-frequency adaptation: evidence for a multiple-channel model of short-wavelength-sensitive-cone spatial vision. Vis Res 33(5–6):665–675
Jhuang H, Garrote E, Yu X, Khilnani V, Poggio T, Steele AD, Serre T (2010) Automated home-cage behavioural phenotyping of mice. Nat Commun 1(6):1–9
Jhuang H, Serre T, Wolf L, Poggio T (2007) A biologically inspired system for action recognition. In: 2007 IEEE 11th International Conference on Computer Vision, pp 1–8
Johnson EN, Hawken MJ, Shapley R (2001) The spatial transformation of color in the primary visual cortex of the macaque monkey. Nat Neurosci 4(4):409–416
Johnson EN, Hawken MJ, Shapley R (2004) Cone inputs in macaque primary visual cortex. J Neurophysiol 91(6):2501–2514
Johnson EN, Hawken MJ, Shapley R (2008) The orientation selectivity of color-responsive neurons in macaque V1. J Neurosci 28(32):8096–8106
Jones JP, Palmer LA (1987) An evaluation of the two-dimensional Gabor filter model of simple receptive fields in cat striate cortex. J Neurophysiol 58(6):1233–1258
Khaligh-Razavi S-M, Kriegeskorte N (2014) Deep supervised, but not unsupervised, models may explain IT cortical representation. PLoS Comput Biol 10(11):e1003915
Kheradpisheh SR, Ganjtabesh M, Masquelier T (2015) Bio-inspired unsupervised learning of visual features leads to robust invariant object recognition. arXiv:1504.03871v3
Kim J, Mély DA, Serre T (2015) A critical evaluation of computational mechanisms of binocular disparity
Kouh M, Poggio T (2008) A canonical neural circuit for cortical nonlinear operations. Neural Comput 20(6):1427–1451
Landy MS, Movshon JA (1991) Computational models of visual processing. MIT Press
LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436–444
LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324
Li N, DiCarlo JJ (2008) Unsupervised natural experience rapidly alters invariant object representation in visual cortex. Science 321(5895):1502–1507
Li N, DiCarlo JJ (2010) Unsupervised natural visual experience rapidly reshapes size-invariant object representation in inferior temporal cortex. Neuron 67(6):1062–1075
Li N, DiCarlo JJ (2012) Neuronal learning of invariant object representation in the ventral visual stream is not dependent on reward. J Neurosci 32(19):6611–6620
Malik J, Perona P (1990) Preattentive texture discrimination with early vision mechanisms. J Opt Soc Am A 7(5):923–932
Marcelja S (1980) Mathematical description of the responses of simple cortical cells. J Opt Soc Am 70:1297–1300
Marko H, Giebel H (1970) Recognition of handwritten characters with a system of homogeneous layers. Nachrichtentechnische Z 23:455–459
Marr D (1982) Vision: a computational investigation into the human representation and processing of visual information. W.H. Freeman & Co Ltd, San Francisco
Masquelier T, Serre T, Poggio T (2007) Learning complex cell invariance from natural videos: a plausibility proof. Technical report, Massachusetts Institute of Technology, Cambridge MA
Masquelier T, Thorpe SJ (2007) Unsupervised learning of visual features through spike timing dependent plasticity. PLoS Comput Biol 3(2):e31
Maunsell JH, Van Essen DC (1983) Functional properties of neurons in middle temporal visual area of the macaque monkey. II. Binocular interactions and sensitivity to binocular disparity
Mel BW (1997) SEEMORE: combining color, shape, and texture histogramming in a neurally inspired approach to visual object recognition. Neural Comput 9(4):777–804
Mineault P, Khawaja F, Butts D, Pack C (2012) Hierarchical processing of complex motion along the primate dorsal visual pathway. Proc Natl Acad Sci USA 109(16):E972–E980
Movshon JA, Adelson EH, Gizzi MS, Newsome WT (1985) The analysis of moving visual patterns. Pattern Recogn Mech
Mullen KT, Losada MA (1999) The spatial tuning of color and luminance peripheral vision measured with notch filtered noise masking. Vis Res 39(4):721–731
Nishimoto S, Gallant JL (2011) A three-dimensional spatiotemporal receptive field model explains responses of area MT neurons to naturalistic movies. J Neurosci 31(41):14551–14564
Ohzawa I (1998) Mechanisms of stereoscopic vision: the disparity energy model. Curr Opin Neurobiol 8(4):509–515
Ohzawa I, DeAngelis GC, Freeman RD (1997) Encoding of binocular disparity by complex cells in the cat’s visual cortex. J Neurophysiol 77(6):2879–2909
Ohzawa I, DeAngelis GC, Freeman RD (1996) Encoding of binocular disparity by simple cells in the cat’s visual cortex. J Neurophysiol 75(5):1779–1805
O’Reilly RC, Wyatte D, Herd S, Mingus B, Jilk DJ (2013) Recurrent processing during object recognition. Front Psychol 4(April):1–14
Ostojic S, Brunel N (2011) From spiking neuron models to linear-nonlinear models. PLoS Comput Biol 7(1):e1001056
Perrone JA, Thiele A (2001) Speed skills: measuring the visual speed analyzing properties of primate MT neurons. Nat Neurosci 4(5):526–532
Plaza SM, Scheffer LK, Chklovskii DB (2014) Toward large-scale connectome reconstructions. Curr Opin Neurobiol 25:201–210
Poggio T, Serre T (2013) Models of the visual cortex. Scholarpedia 8(4):3516
Poggio T, Smale S (2003) The mathematics of learning: dealing with data. Not Am Math Soc 50(5)
Priebe NJ, Cassanello CR, Lisberger SG (2003) The neural representation of speed in macaque area MT/V5. J Neurosci 23(13):5650–5661
Qian N (1994) Computing stereo disparity and motion with known binocular cell properties. Neural Comput 6(3):390–404
Read JC, Cumming BG (2007) Sensors for impossible stimuli may solve the stereo correspondence problem. Nat Neurosci 10(10):1322–1328
Reichert DP, Serre T (2014) Neuronal synchrony in complex-valued deep networks. In: International Conference on Learning Representations
Rieke F, Warland D, van Steveninck R, Bialek W (1997) Spikes. The MIT Press, Cambridge, Massachusetts
Riesenhuber M, Poggio T (1999) Hierarchical models of object recognition in cortex. Nat Neurosci 2(11):1019–1025
Riesenhuber M, Poggio T (2000) Models of object recognition. Nat Neurosci 3:1199–1204
Ringach DL (2004) Haphazard wiring of simple receptive fields and orientation columns in visual cortex. J Neurophysiol 92:468–476
Ringach DL (2004) Mapping receptive fields in primary visual cortex. J Physiol 558(3):717–728
Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein M, Berg AC, Fei-Fei L (2014) Imagenet large scale visual recognition challenge. arXiv:1409.0575v3
Rust NC, Mante V, Simoncelli EP, Movshon JA (2006) How MT cells analyze the motion of visual patterns. Nat Neurosci 9(11):1421–1431
Rust NC, Schwartz O, Movshon JA, Simoncelli EP (2005) Spatiotemporal elements of macaque V1 receptive fields. Neuron 46(6):945–956
Sanada TM, Ohzawa I (2006) Encoding of three-dimensional surface slant in cat visual areas 17 and 18. J Neurophysiol 95(5):2768–2786
Sasaki K, Ohzawa I (2007) Internal spatial organization of receptive fields of complex cells in the early visual cortex. J Neurophysiol 98(3):1194–1212
Series P, Lorenceau J, Frégnac Y (2003) The silent surround of V1 receptive fields: theory and experiments. J Physiol 97:453–474
Serre T (2014) Hierarchical models of the visual system
Serre T, Kreiman G, Kouh M, Cadieu C, Knoblich U, Poggio T (2007) A quantitative theory of immediate visual recognition. Prog Brain Res 165(06):33–56
Serre T, Oliva A, Poggio T (2007) A feedforward architecture accounts for rapid categorization. Proc Natl Acad Sci USA 104(15):6424–6429
Serre T, Poggio T (2010) A neuromorphic approach to computer vision. Commun ACM 53(10):54
Shapley R, Hawken MJ (2011) Color in the cortex: single- and double-opponent cells. Vis Res 51:701–717
Simoncelli EP, Heeger DJ (1998) A model of neuronal responses in visual area MT. Vis Res 38(5):743–761
Singer W, Gray CM (1995) Visual feature integration and the temporal correlation hypothesis. Annu Rev Neurosci 18:555–586
Stanley GB (2013) Reading and writing the neural code. Nat Neurosci 16(3):259–263
Stevenson IH, Kording KP (2011) How advances in neural recording affect data analysis. Nat Neurosci 14(2):139–142
Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A (2014) Going deeper with convolutions. arXiv:1409.4842v1
Uhlhaas PJ, Pipa G, Lima B, Melloni L, Neuenschwander S, Nikolić D, Singer W (2009) Neural synchrony in cortical networks: history, concept and current status. Front Integr Neurosci 3:17
Ullman S (2007) Object recognition and segmentation by a fragment-based hierarchy. Trends Cogn Sci 11(2):58–64
von der Malsburg C (1994) The correlation theory of brain function. In: Domany E (ed) Models of neural networks II, pp 94–119. Springer
Wallis G, Rolls ET (1997) A model of invariant recognition in the visual system. Prog Neurobiol 51:167–194
Watson AB (1987) Efficiency of a model human image code. J Opt Soc Am A 4(12):2401–2417
Watson AB, Ahumada AJ (1985) Model of human visual-motion sensing. J Opt Soc Am A 2(2):322–341
Watson AB, Barlow HB, Robson JG (1983) What does the eye see best? Nature 302(5907):419–422
Wersing H, Koerner E (2003) Learning optimized features for hierarchical models of invariant recognition. Neural Comput 15(7):1559–1588
Yamins DLK, Hong H, Cadieu CF, Solomon EA, Seibert D, DiCarlo JJ (2014) Performance-optimized hierarchical models predict neural responses in higher visual cortex. Proc Natl Acad Sci USA 111(23):8619–8624
Young RA (1987) The Gaussian derivative model for spatial vision: I. Retinal mechanisms. Spat Vis 2(4):273–293
Young RA, Lesperance RM (2001) The Gaussian derivative model for spatial-temporal vision: II. Cortical data. Spat Vis 14(3):321–389
Zhang J, Barhomi Y, Serre T (2012) A new biologically inspired color image descriptor. In: European Conference on Computer Vision. LNCS, vol 7576, pp 312–324

Author information

Authors and Affiliations

Cognitive, Linguistic & Psychological Sciences Department, Brown Institute for Brain Science, Brown University, Providence, RI, 02912, USA
David A. Mély & Thomas Serre

Corresponding author

Correspondence to Thomas Serre.

Editor information

Editors and Affiliations

Electrical and Computer Engineering, National University of Singapore, Singapore, Singapore
Qi Zhao

About this chapter

Cite this chapter

Mély, D.A., Serre, T. (2017). Towards a Theory of Computation in the Visual Cortex. In: Zhao, Q. (ed) Computational and Cognitive Neuroscience of Vision. Cognitive Science and Technology. Springer, Singapore. https://doi.org/10.1007/978-981-10-0213-7_4

DOI: https://doi.org/10.1007/978-981-10-0213-7_4
Published: 04 October 2016
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-0211-3
Online ISBN: 978-981-10-0213-7
eBook Packages: Engineering, Engineering (R0)
