# Invariant Recognition Predicts Tuning of Neurons in Sensory Cortex

## Abstract

Tuning properties of simple cells in cortical V1 can be described in terms of a “universal shape” characterized quantitatively by parameter values that hold across different species (Jones and Palmer 1987; Ringach 2002; Niell and Stryker 2008). This puzzling set of findings calls for a general explanation grounded in an evolutionarily important computational function of the visual cortex. We show here that these properties are quantitatively predicted by the hypothesis that the goal of the ventral stream is to compute, for each image, a “signature” vector that is invariant to geometric transformations (Anselmi et al. 2013b). The mechanism for continuously learning and maintaining invariance may be the memory storage, via Hebbian synapses, of a sequence of neural images of a few (arbitrary) objects undergoing transformations such as translation, scale changes and rotation. For V1 simple cells this hypothesis implies that the tuning of neurons converges to the eigenvectors of the covariance of their input. Starting with a set of dendritic fields spanning a range of sizes, we show, with simulations suggested by a direct analysis, that the solution of the associated “cortical equation” effectively provides a set of Gabor-like shapes with parameter values that quantitatively agree with the physiology data. The same theory provides predictions about the tuning of cells in V4 and in the face patch AL (Leibo et al. 2013a) that are in qualitative agreement with physiology data.
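The claim that Hebbian learning drives tuning toward eigenvectors of the input covariance can be illustrated with a toy simulation. The sketch below is not the chapter's simulation: it assumes 1-D inputs, a few smoothed random templates translated past a Gaussian aperture (a stand-in for a dendritic field), and Oja's rule (Oja 1982) as the Hebbian-like update. The names `make_inputs` and `oja`, and all parameter values, are hypothetical choices for illustration. Whether the learned vector looks Gabor-like depends on the input statistics and preprocessing, which this sketch does not attempt to reproduce.

```python
import numpy as np

def make_inputs(n_templates=4, length=256, patch=33, smooth=3.0, seed=0):
    """Translate a few smooth random 1-D templates past a Gaussian aperture."""
    rng = np.random.default_rng(seed)
    t = np.arange(-4 * int(smooth), 4 * int(smooth) + 1)
    kernel = np.exp(-t**2 / (2 * smooth**2))           # smoothing kernel (assumption)
    centre = np.arange(patch) - patch // 2
    aperture = np.exp(-centre**2 / (2 * (patch / 6.0) ** 2))
    frames = []
    for _ in range(n_templates):
        template = np.convolve(rng.standard_normal(length), kernel, mode="same")
        template -= template.mean()                     # zero-mean template
        for shift in range(length):                     # every circular translation
            frames.append(aperture * np.roll(template, shift)[:patch])
    X = np.array(frames)
    # Scale so the average squared norm of a frame is 1 (keeps the step size tame).
    return X / np.sqrt(np.mean(np.sum(X**2, axis=1)))

def oja(X, eta=0.005, epochs=40, seed=1):
    """Online Oja rule: w += eta * y * (x - y * w), with y = w . x."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(X.shape[1])
    w /= np.linalg.norm(w)
    for _ in range(epochs):
        for x in X:                                     # frames in translation order
            y = w @ x
            w += eta * y * (x - y * w)
    return w

X = make_inputs()
w = oja(X)

# Compare with the leading eigenvector of the empirical input covariance.
C = X.T @ X / len(X)
eigvals, eigvecs = np.linalg.eigh(C)                    # eigenvalues in ascending order
v1 = eigvecs[:, -1]
alignment = abs(w @ v1) / np.linalg.norm(w)             # |cosine| between w and v1
rayleigh_ratio = (w @ C @ w) / (w @ w) / eigvals[-1]    # 1.0 means variance is maximal
print(f"|cos(w, v1)| = {alignment:.3f}, Rayleigh ratio = {rayleigh_ratio:.3f}")
```

Under these assumptions the learned weight vector should align closely with the principal eigenvector; the printed values give a quick check. Recovering further eigenvectors would need an extension such as deflation, which is outside this sketch.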

## Keywords

Visual Experience; Independent Component Analysis; Simple Cell; Gabor Wavelet; Deep Neural Network

## Notes

### Acknowledgments

This work was supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF 1231216.

## References

- Abdel-Hamid O, Mohamed A, Jiang H, Penn G (2012) Applying convolutional neural networks concepts to hybrid nn-hmm model for speech recognition. In: 2012 IEEE International conference on acoustics, speech and signal processing (ICASSP), pp 4277–4280. IEEE
- Anselmi F, Leibo JZ, Mutch J, Rosasco L, Tacchetti A, Poggio T (2013a) Part I: computation of invariant representations in visual cortex and in deep convolutional architectures. In preparation
- Anselmi F, Leibo JZ, Rosasco L, Mutch J, Tacchetti A, Poggio T (2013b) Unsupervised learning of invariant representations in hierarchical architectures. Theoret Comput Sci. CBMM Memo n 1, in press. arXiv:1311.4158
- Anselmi F, Poggio T (2010) Representation learning in sensory cortex: a theory. CBMM memo n 26
- Bell A, Sejnowski T (1997) The independent components of natural scenes are edge filters. Vis Res 3327–3338
- Boyd J (1984) Asymptotic coefficients of hermite function series. J Comput Phys 54:382–410
- Croner L, Kaplan E (1995) Receptive fields of p and m ganglion cells across the primate retina. Vis Res 35(1):7–24
- Dan Y, Atick JJ, Reid RC (1996) Efficient coding of natural scenes in the lateral geniculate nucleus: experimental test of a computational theory. J Neurosci 16:3351–3362
- Földiák P (1991) Learning invariance from transformation sequences. Neural Comput 3(2):194–200
- Freiwald W, Tsao D (2010) Functional compartmentalization and viewpoint generalization within the macaque face-processing system. Science 330(6005):845
- Fukushima K (1980) Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biol Cybern 36(4):193–202
- Gallant J, Connor C, Rakshit S, Lewis J, Van Essen D (1996) Neural responses to polar, hyperbolic, and Cartesian gratings in area V4 of the macaque monkey. J Neurophysiol 76:2718–2739
- Hebb DO (1949) The organization of behaviour: a neuropsychological theory. Wiley
- Hyvärinen A, Oja E (1998) Independent component analysis by general non-linear hebbian-like learning rules. Signal Proces 64:301–313
- Jones JP, Palmer LA (1987) An evaluation of the two-dimensional Gabor filter model of simple receptive fields in cat striate cortex. J Neurophysiol 58(6):1233–1258
- Kay K, Naselaris T, Prenger R, Gallant J (2008) Identifying natural images from human brain activity. Nature 452(7185):352–355
- Krizhevsky A, Sutskever I, Hinton G (2012) Imagenet classification with deep convolutional neural networks. Adv Neural Inf Proces Syst 25
- Le QV, Monga R, Devin M, Corrado G, Chen K, Ranzato M, Dean J, Ng AY (2011) Building high-level features using large scale unsupervised learning. CoRR. arXiv:1112.6209
- LeCun Y, Boser B, Denker J, Henderson D, Howard R, Hubbard W, Jackel L (1989) Backpropagation applied to handwritten zip code recognition. Neural Comput 1(4):541–551
- LeCun Y, Bengio Y (1995) Convolutional networks for images, speech, and time series. The handbook of brain theory and neural networks, pp 255–258
- Leibo JZ, Anselmi F, Mutch J, Ebihara AF, Freiwald WA, Poggio T (2013a) View-invariance and mirror-symmetric tuning in a model of the macaque face-processing system. Comput Syst Neurosci I–54. Salt Lake City, USA
- Leibo JZ, Anselmi F, Mutch J, Ebihara AF, Freiwald WA, Poggio T (2013b) View-invariance and mirror-symmetric tuning in a model of the macaque face-processing system. Comput Syst Neurosci (COSYNE)
- Li N, DiCarlo JJ (2008) Unsupervised natural experience rapidly alters invariant object representation in visual cortex. Science 321(5895):1502–1507
- Mallat S (2012) Group invariant scattering. Commun Pure Appl Math 65(10):1331–1398
- Meister M, Wong R, Baylor DA, Shatz CJ et al (1991) Synchronous bursts of action potentials in ganglion cells of the developing mammalian retina. Science 252(5008):939–943
- Mel BW (1997) SEEMORE: combining color, shape, and texture histogramming in a neurally inspired approach to visual object recognition. Neural Comput 9(4):777–804
- Müller-Kirsten HJW (2012) Introduction to quantum mechanics: Schrödinger equation and path integral, 2nd edn. World Scientific, Singapore
- Mutch J, Lowe D (2008) Object class recognition and localization using sparse features with limited receptive fields. Int J Comput Vis 80(1):45–57
- Niell C, Stryker M (2008) Highly selective receptive fields in mouse visual cortex. J Neurosci 28(30):7520–7536
- Oja E (1982) Simplified neuron model as a principal component analyzer. J Math Biol 15(3):267–273
- Oja E (1992) Principal components, minor components, and linear neural networks. Neural Netw 5(6):927–935
- Olshausen BA, Cadieu CF, Warland D (2009) Learning real and complex overcomplete representations from the statistics of natural images. In: Goyal VK, Papadakis M, van de Ville D (eds) SPIE Proceedings, vol. 7446: Wavelets XIII
- Olshausen B et al (1996) Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature 381(6583):607–609
- Perona P (1991) Deformable kernels for early vision. IEEE Trans Pattern Anal Mach Intell 17:488–499
- Perrett D, Oram M (1993) Neurophysiology of shape processing. Image Vis Comput 11(6):317–333
- Pinto N, DiCarlo JJ, Cox D (2009) How far can you get with a modern face recognition test set using only simple features? In: CVPR 2009. IEEE Conference on computer vision and pattern recognition, 2009. IEEE, pp 2591–2598
- Poggio T, Edelman S (1990) A network that learns to recognize three-dimensional objects. Nature 343(6255):263–266
- Poggio T, Mutch J, Anselmi F, Leibo JZ, Rosasco L, Tacchetti A (2011) Invariances determine the hierarchical architecture and the tuning properties of the ventral stream. Technical report available online, MIT CBCL, 2013. Previously released as MIT-CSAIL-TR-2012-035, 2012 and in Nature Precedings, 2011
- Poggio T, Mutch J, Anselmi F, Leibo JZ, Rosasco L, Tacchetti A (2012) The computational magic of the ventral stream: sketch of a theory (and why some deep architectures work). Technical report MIT-CSAIL-TR-2012-035, MIT Computer Science and Artificial Intelligence Laboratory, 2012. Previously released in Nature Precedings, 2011
- Poggio T, Mutch J, Isik L (2014) Computational role of eccentricity dependent cortical magnification. CBMM Memo No. 017. CBMM Funded. arXiv:1406.1770v1
- Rehn M, Sommer FT (2007) A network that uses few active neurones to code visual input predicts the diverse shapes of cortical receptive fields. J Comput Neurosci 22(2):135–146
- Riesenhuber M, Poggio T (1999) Hierarchical models of object recognition in cortex. Nature Neurosci. 2(11):1019–1025
- Ringach D (2002) Spatial structure and symmetry of simple-cell receptive fields in macaque primary visual cortex. J Neurophysiol 88(1):455–463
- Saxe AM, Bhand M, Mudur R, Suresh B, Ng AY (2011) Unsupervised learning models of primary cortical receptive fields and receptive field plasticity. In: Shawe-Taylor J, Zemel R, Bartlett P, Pereira F, Weinberger K (eds) Advances in neural information processing systems, vol 24, pp 1971–1979
- Serre T, Wolf L, Bileschi S, Riesenhuber M, Poggio T (2007) Robust object recognition with cortex-like mechanisms. IEEE Trans Pattern Anal Mach Intell 29(3):411–426
- Stevens CF (2004) Preserving properties of object shape by computations in primary visual cortex. PNAS 101(11):15524–15529
- Stringer S, Rolls E (2002) Invariant object recognition in the visual system with novel views of 3D objects. Neural Comput 14(11):2585–2596
- Torralba A, Oliva A (2003) Statistics of natural image categories. In: Network: computation in neural systems, pp 391–412
- Turrigiano GG, Nelson SB (2004) Homeostatic plasticity in the developing nervous system. Nature Rev Neurosci 5(2):97–107
- Wong R, Meister M, Shatz C (1993) Transient period of correlated bursting activity during development of the mammalian retina. Neuron 11(5):923–938
- Zylberberg J, Murphy JT, DeWeese MR (2011) A sparse coding model with synaptically local plasticity and spiking neurons can account for the diverse shapes of v1 simple cell receptive fields. PLoS Comput Biol, 7(10):135–146