Towards a Theory of Computation in the Visual Cortex

Computational and Cognitive Neuroscience of Vision

Part of the book series: Cognitive Science and Technology (CSAT)

Abstract

One of the major goals in visual neuroscience is to understand how the cortex processes visual information (Marr 1982). A substantial effort has thus gone into characterizing input-output relationships across areas of the visual cortex (DiCarlo et al. 2012), which has yielded an array of computational models. These models have, however, typically focused on one or very few visual areas, modules (form, motion, depth, color) or functions (e.g., object recognition, boundary detection, action recognition); see Poggio and Serre (2013) for a recent review. An integrated framework that would explain the computational mechanisms underlying vision beyond any specific visual area, module or function, while being at least consistent with the known anatomy and physiology of the visual cortex, is still lacking. The goal of this review is to draft an initial integrated theory of visual processing in the cortex. We highlight the computational mechanisms that are shared across many successful models and derive a taxonomy of canonical computations. Such an enterprise is reductionist in nature, as we break down the myriad input-output functions found in the visual cortex into a basic set of computations. Identifying canonical computations that are repeated and combined across visual functions will pave the way for the identification of their cortical substrate (Carandini 2012).

Notes

  1. The “aperture problem” reflects the inherent ambiguity associated with the direction of motion of a moving stimulus within the receptive fields (a small aperture) of neurons in early visual areas. Because of its limited receptive field, a motion-selective neuron will often produce identical responses for stimuli that vary greatly in their shape, speed and orientation.

  2. A full model would also include eye dominance.
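
The ambiguity described in note 1 can be illustrated with a minimal sketch. It is not taken from the chapter: it assumes an idealized straight edge that extends beyond the aperture, and the function names, the edge orientation and the two velocities are hypothetical values chosen for illustration. Under that assumption only the velocity component perpendicular to the edge is visible through the aperture, so two different true velocities can look identical.

```python
import math


def edge_normal(orientation_deg):
    """Unit vector perpendicular to a straight edge of the given orientation."""
    theta = math.radians(orientation_deg)
    return (-math.sin(theta), math.cos(theta))


def edge_tangent(orientation_deg):
    """Unit vector running along the edge."""
    theta = math.radians(orientation_deg)
    return (math.cos(theta), math.sin(theta))


def observed_motion(velocity, orientation_deg):
    """Signed speed of the edge as seen through a small aperture.

    Sliding a straight edge along itself changes nothing inside the
    aperture, so only the component of the velocity along the edge
    normal is visible.
    """
    nx, ny = edge_normal(orientation_deg)
    return velocity[0] * nx + velocity[1] * ny


orientation = 45.0  # hypothetical edge orientation, in degrees
v_a = (1.0, 0.0)    # hypothetical true velocity A
tx, ty = edge_tangent(orientation)
# Velocity B is velocity A plus extra motion along the edge itself.
v_b = (v_a[0] + 2.0 * tx, v_a[1] + 2.0 * ty)

seen_a = observed_motion(v_a, orientation)
seen_b = observed_motion(v_b, orientation)
print(f"true velocities differ: {v_a} vs {v_b}")
print(f"observed normal speeds: {seen_a:.4f} vs {seen_b:.4f}")
```

The two printed normal speeds agree even though the true velocities do not, which is the sense in which a neuron with a small receptive field cannot recover the full direction of motion on its own.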

References

  • Adelson EH, Bergen JR (1985) Spatiotemporal energy models for the perception of motion. J Opt Soc Am A 2(2):284–299

  • Alonso JM, Martinez LM (1998) Functional connectivity between simple cells and complex cells in cat striate cortex. Nat Neurosci 1(5):395–403

  • Bradley D, Goyal M (2008) Velocity computation in the primate visual system. Nat Rev Neurosci 9(9):686–695

  • Cadieu CF, Hong H, Yamins DLK, Pinto N, Ardila D, Solomon EA, Majaj NJ, DiCarlo JJ (2014) Deep neural networks rival the representation of primate IT cortex for core visual object recognition. PLoS Comput Biol 10(12):e1003963

  • Carandini M (2012) From circuits to behavior: a bridge too far? Nat Neurosci 15(4):507–509

  • Carandini M, Heeger D (2012) Normalization as a canonical neural computation. Nat Rev Neurosci 13:51–62

  • Chen X, Han F, Poo M-M, Dan Y (2007) Excitatory and suppressive receptive field subunits in awake monkey primary visual cortex (V1). Proc Natl Acad Sci USA 104(48):19120–19125

  • Conway BR (2001) Spatial structure of cone inputs to color cells in alert macaque primary visual cortex (V-1). J Neurosci 21(8):2768–2783

  • Crick F (1984) Function of the thalamic reticular complex: the searchlight hypothesis. Proc Natl Acad Sci USA 81:4586–4590

  • Crouzet SM, Serre T (2011) What are the visual features underlying rapid object recognition? Front Psychol 2:326

  • Daugman JG (1980) Two-dimensional spectral analysis of cortical receptive field profile. Vis Res 20:847–856

  • Daugman JG (1985) Uncertainty relation for resolution in space, spatial frequency, and orientation optimization by two-dimensional visual cortical filters. J Opt Soc Am A 2(7):1160–1169

  • Dayan P, Abbott LF (2001) Theoretical neuroscience: computational and mathematical modeling of neural systems. MIT Press

  • Desimone R, Albright TD, Gross CG, Bruce C (1984) Stimulus-selective properties of inferior temporal neurons in the macaque. J Neurosci 4(8):2051–2062

  • DeValois RL, Albrecht DG, Thorell LG (1982) Spatial-frequency selectivity of cells in macaque visual cortex. Vis Res 22:545–559

  • DiCarlo JJ, Zoccolan D, Rust NC (2012) How does the brain solve visual object recognition? Neuron 73(3):415–434

  • Douglas RJ, Martin KAC (2007) Mapping the matrix: the ways of neocortex. Neuron 56(2):226–238

  • Everingham M, Van Gool L, Williams C, Winn J, Zisserman A (2010) The PASCAL visual object classes (VOC) challenge. Int J Comput Vis 88(2):303–338

  • Felleman DJ, Van Essen DC (1991) Distributed hierarchical processing in the primate cerebral cortex. Cereb Cortex 1:1–47

  • Fenno L, Yizhar O, Deisseroth K (2011) The development and application of optogenetics. Annu Rev Neurosci 34:389–412

  • Finn I, Ferster D (2007) Computational diversity in complex cells of cat primary visual cortex. J Neurosci 27(36):9638–9648

  • Fleet DJ, Wagner H, Heeger DJ (1996) Neural encoding of binocular disparity: energy models, position shifts and phase shifts. Vis Res 36(12):1839–1857

  • Fleuret F, Li T, Dubout C, Wampler EK, Yantis S, Geman D (2011) Comparing machines and humans on a visual categorization test. Proc Natl Acad Sci USA 108(43):17621–17625

  • Fries P (2005) A mechanism for cognitive dynamics: neuronal communication through neuronal coherence. Trends Cogn Sci 9(10):474–480

  • Fukushima K (1983) Neocognitron: a neural network model for a mechanism of visual pattern recognition. IEEE Trans Syst Man Cybern B Cybern 13:826–834

  • Geman D, Koloydenko A (1999) Invariant statistics and coding of natural microimages. Proc IEEE Work Stat Comput Theor Vis

  • Ghodrati M, Farzmahdi A, Rajaei K, Ebrahimpour R, Khaligh-Razavi S-M (2014) Feedforward object-vision models only tolerate small image variations compared to human. Front Comput Neurosci 8:74

  • Ghodrati M, Khaligh-Razavi S-M, Ebrahimpour R, Rajaei K, Pooyan M (2012) How can selection of biologically inspired features improve the performance of a robust object recognition model? PLoS One 7(2):e32357

  • Giese MA, Poggio T (2003) Neural mechanisms for the recognition of biological movements. Nat Rev Neurosci 4(3):179–192

  • Güçlü U, van Gerven MAJ (2015) Deep neural networks reveal a gradient in the complexity of neural representations across the ventral stream. J Neurosci 35(27):10005–10014

  • Gülçehre C, Bengio Y (2013) Knowledge matters: importance of prior information for optimization. arXiv:1301.4083v6

  • He K, Zhang X, Ren S, Sun J (2015) Delving deep into rectifiers: surpassing human-level performance on imagenet classification

  • Heeger DJ (1992) Normalization of cell responses in cat striate cortex. Vis Neurosci 9(2):181–197

  • Hegdé J, Van Essen DC (2007) A comparative study of shape representation in macaque visual areas V2 and V4. Cereb Cortex 17(5):1100–1116

  • Hegdé J, Felleman DJ (2007) Reappraising the functional implications of the primate visual anatomical hierarchy. Neuroscientist 13(5):416–421

  • Hubel D, Wiesel T (1962) Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. J Physiol 160:106–154

  • Humanski RA, Wilson HR (1993) Spatial-frequency adaptation: evidence for a multiple-channel model of short-wavelength-sensitive-cone spatial vision. Vis Res 33(5–6):665–675

  • Jhuang H, Garrote E, Yu X, Khilnani V, Poggio T, Steele AD, Serre T (2010) Automated home-cage behavioural phenotyping of mice. Nat Commun 1(6):1–9

  • Jhuang H, Serre T, Wolf L, Poggio T (2007) A biologically inspired system for action recognition. In: 2007 IEEE 11th International Conference on Computer Vision, pp 1–8

  • Johnson EN, Hawken MJ, Shapley R (2001) The spatial transformation of color in the primary visual cortex of the macaque monkey. Nat Neurosci 4(4):409–416

  • Johnson EN, Hawken MJ, Shapley R (2004) Cone inputs in macaque primary visual cortex. J Neurophysiol 91(6):2501–2514

  • Johnson EN, Hawken MJ, Shapley R (2008) The orientation selectivity of color-responsive neurons in macaque V1. J Neurosci 28(32):8096–8106

  • Jones JP, Palmer LA (1987) An evaluation of the two-dimensional Gabor filter model of simple receptive fields in cat striate cortex. J Neurophysiol 58(6):1233–1258

  • Khaligh-Razavi S-M, Kriegeskorte N (2014) Deep supervised, but not unsupervised, models may explain IT cortical representation. PLoS Comput Biol 10(11):e1003915

  • Kheradpisheh SR, Ganjtabesh M, Masquelier T (2015) Bio-inspired unsupervised learning of visual features leads to robust invariant object recognition. arXiv:1504.03871v3

  • Kim J, Mely DA, Serre T (2015) A critical evaluation of computational mechanisms of binocular disparity

  • Kouh M, Poggio T (2008) A canonical neural circuit for cortical nonlinear operations. Neural Comput 20(6):1427–1451

  • Landy MS, Movshon JA (1991) Computational models of visual processing. MIT Press

  • LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436–444

  • LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324

  • Li N, DiCarlo JJ (2008) Unsupervised natural experience rapidly alters invariant object representation in visual cortex. Science 321(5895):1502–1507

  • Li N, DiCarlo JJ (2010) Unsupervised natural visual experience rapidly reshapes size-invariant object representation in inferior temporal cortex. Neuron 67(6):1062–1075

  • Li N, DiCarlo JJ (2012) Neuronal learning of invariant object representation in the ventral visual stream is not dependent on reward. J Neurosci 32(19):6611–6620

  • Malik J, Perona P (1990) Preattentive texture discrimination with early vision mechanisms. J Opt Soc Am A 7(5):923–932

  • Marcelja S (1980) Mathematical description of the responses of simple cortical cells. J Opt Soc Am 70:1297–1300

  • Marko H, Giebel H (1970) Recognition of handwritten characters with a system of homogeneous layers. Nachrichtentechnische Z 23:455–459

  • Marr D (1982) Vision: a computational investigation into the human representation and processing of visual information. W.H.Freeman & Co Ltd, San Francisco

  • Masquelier T, Serre T, Poggio T (2007) Learning complex cell invariance from natural videos: a plausibility proof. Technical report, Massachusetts Institute of Technology, Cambridge MA

  • Masquelier T, Thorpe SJ (2007) Unsupervised learning of visual features through spike timing dependent plasticity. PLoS Comput Biol 3(2):e31

  • Maunsell JH, Van Essen DC (1983) Functional properties of neurons in middle temporal visual area of the macaque monkey. II. Binocular interactions and sensitivity to binocular disparity

  • Mel BW (1997) SEEMORE: combining color, shape, and texture histogramming in a neurally inspired approach to visual object recognition. Neural Comput 9(4):777–804

  • Mineault P, Khawaja F, Butts D, Pack C (2012) Hierarchical processing of complex motion along the primate dorsal visual pathway. Proc Natl Acad Sci USA 109(16):E972–E980

  • Movshon JA, Adelson EH, Gizzi MS, Newsome WT (1985) The analysis of moving visual patterns. Pattern Recogn Mech

  • Mullen KT, Losada MA (1999) The spatial tuning of color and luminance peripheral vision measured with notch filtered noise masking. Vis Res 39(4):721–731

  • Nishimoto S, Gallant JL (2011) A three-dimensional spatiotemporal receptive field model explains responses of area MT neurons to naturalistic movies. J Neurosci 31(41):14551–14564

  • Ohzawa I (1998) Mechanisms of stereoscopic vision: the disparity energy model. Curr Opin Neurobiol 8(4):509–515

  • Ohzawa I, DeAngelis GC, Freeman RD (1997) Encoding of binocular disparity by complex cells in the cat’s visual cortex. J Neurophysiol 77(6):2879–2909

  • Ohzawa I, DeAngelis GC, Freeman RD (1996) Encoding of binocular disparity by simple cells in the cat’s visual cortex. J Neurophysiol 75(5):1779–1805

  • O’Reilly RC, Wyatte D, Herd S, Mingus B, Jilk DJ (2013) Recurrent processing during object recognition. Front Psychol 4(April):1–14

  • Ostojic S, Brunel N (2011) From spiking neuron models to linear-nonlinear models. PLoS Comput Biol 7(1):e1001056

  • Perrone JA, Thiele A (2001) Speed skills: measuring the visual speed analyzing properties of primate MT neurons. Nat Neurosci 4(5):526–532

  • Plaza SM, Scheffer LK, Chklovskii DB (2014) Toward large-scale connectome reconstructions. Curr Opin Neurobiol 25:201–210

  • Poggio T, Serre T (2013) Models of the visual cortex. Scholarpedia 8(4):3516

  • Poggio T, Smale S (2003) The mathematics of learning: dealing with data. Not Am Math Soc 50(5)

  • Priebe NJ, Cassanello CR, Lisberger SG (2003) The neural representation of speed in macaque area MT/V5. J Neurosci 23(13):5650–5661

  • Qian N (1994) Computing stereo disparity and motion with known binocular cell properties. Neural Comput 6(3):390–404

  • Read JC, Cumming BG (2007) Sensors for impossible stimuli may solve the stereo correspondence problem. Nat Neurosci 10(10):1322–1328

  • Reichert DP, Serre T (2014) Neuronal synchrony in complex-valued deep networks. In: International Conference on Learning Representations

  • Rieke F, Warland D, de Ruyter van Steveninck R, Bialek W (1997) Spikes. The MIT Press, Cambridge, Massachusetts

  • Riesenhuber M, Poggio T (1999) Hierarchical models of object recognition in cortex. Nat Neurosci 2(11):1019–1025

  • Riesenhuber M, Poggio T (2000) Models of object recognition. Nat Neurosci 3:1199–1204

  • Ringach DL (2004) Haphazard wiring of simple receptive fields and orientation columns in visual cortex. J Neurophysiol 92:468–476

  • Ringach DL (2004) Mapping receptive fields in primary visual cortex. J Physiol 558(3):717–728

  • Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein M, Berg AC, Fei-Fei L (2014) Imagenet large scale visual recognition challenge. arXiv:1409.0575v3

  • Rust NC, Mante V, Simoncelli EP, Movshon JA (2006) How MT cells analyze the motion of visual patterns. Nat Neurosci 9(11):1421–1431

  • Rust NC, Schwartz O, Movshon JA, Simoncelli EP (2005) Spatiotemporal elements of macaque V1 receptive fields. Neuron 46(6):945–956

  • Sanada TM, Ohzawa I (2006) Encoding of three-dimensional surface slant in cat visual areas 17 and 18. J Neurophysiol 95(5):2768–2786

  • Sasaki K, Ohzawa I (2007) Internal spatial organization of receptive fields of complex cells in the early visual cortex. J Neurophysiol 98(3):1194–1212

  • Series P, Lorenceau J, Frégnac Y (2003) The silent surround of V1 receptive fields: theory and experiments. J Physiol Paris 97:453–474

  • Serre T (2014) Hierarchical models of the visual system

  • Serre T, Kreiman G, Kouh M, Cadieu C, Knoblich U, Poggio T (2007) A quantitative theory of immediate visual recognition. Prog Brain Res 165:33–56

  • Serre T, Oliva A, Poggio T (2007) A feedforward architecture accounts for rapid categorization. Proc Natl Acad Sci USA 104(15):6424–6429

  • Serre T, Poggio T (2010) A neuromorphic approach to computer vision. Commun ACM 53(10):54

  • Shapley R, Hawken MJ (2011) Color in the cortex: single- and double-opponent cells. Vis Res 51:701–717

  • Simoncelli EP, Heeger DJ (1998) A model of neuronal responses in visual area MT. Vis Res 38(5):743–761

  • Singer W, Gray CM (1995) Visual feature integration and the temporal correlation hypothesis. Annu Rev Neurosci 18:555–586

  • Stanley GB (2013) Reading and writing the neural code. Nat Neurosci 16(3):259–263

  • Stevenson IH, Kording KP (2011) How advances in neural recording affect data analysis. Nat Neurosci 14(2):139–142

  • Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A (2014) Going deeper with convolutions. arXiv:1409.4842v1

  • Uhlhaas PJ, Pipa G, Lima B, Melloni L, Neuenschwander S, Nikolić D, Singer W (2009) Neural synchrony in cortical networks: history, concept and current status. Front Integr Neurosci 3:17

  • Ullman S (2007) Object recognition and segmentation by a fragment-based hierarchy. Trends Cogn Sci 11(2):58–64

  • von der Malsburg C (1994) The correlation theory of brain function. In: Domany E (ed) Models of neural networks II, pp 94–119. Springer

  • Wallis G, Rolls ET (1997) A model of invariant recognition in the visual system. Prog Neurobiol 51:167–194

  • Watson AB (1987) Efficiency of a model human image code. J Opt Soc Am A 4(12):2401–2417

  • Watson AB, Ahumada AJ (1985) Model of human visual-motion sensing. J Opt Soc Am A 2(2):322–341

  • Watson AB, Barlow HB, Robson JG (1983) What does the eye see best? Nature 302(5907):419–422

  • Wersing H, Koerner E (2003) Learning optimized features for hierarchical models of invariant recognition. Neural Comput 15(7):1559–1588

  • Yamins DLK, Hong H, Cadieu CF, Solomon EA, Seibert D, DiCarlo JJ (2014) Performance-optimized hierarchical models predict neural responses in higher visual cortex. Proc Natl Acad Sci USA 111(23):8619–8624

  • Young RA (1987) The Gaussian derivative model for spatial vision: I. Retinal mechanisms. Spat Vis 2(4):273–293

  • Young RA, Lesperance RM (2001) The Gaussian derivative model for spatial-temporal vision: II. Cortical data. Spat Vis 14(3):321–389

  • Zhang J, Barhomi Y, Serre T (2012) A new biologically inspired color image descriptor. In: European Conference on Computer Vision. LNCS, vol 7576, pp 312–324

Author information

Corresponding author

Correspondence to Thomas Serre.

Copyright information

© 2017 Springer Science+Business Media Singapore

About this chapter

Cite this chapter

Mély, D.A., Serre, T. (2017). Towards a Theory of Computation in the Visual Cortex. In: Zhao, Q. (ed) Computational and Cognitive Neuroscience of Vision. Cognitive Science and Technology. Springer, Singapore. https://doi.org/10.1007/978-981-10-0213-7_4

  • DOI: https://doi.org/10.1007/978-981-10-0213-7_4

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-0211-3

  • Online ISBN: 978-981-10-0213-7

  • eBook Packages: Engineering, Engineering (R0)
