Invariant Recognition Predicts Tuning of Neurons in Sensory Cortex

Mutch, Jim; Anselmi, Fabio; Tacchetti, Andrea; Rosasco, Lorenzo; Leibo, Joel Z.; Poggio, Tomaso

doi:10.1007/978-981-10-0213-7_5

Part of the book series: Cognitive Science and Technology ((CSAT))

Abstract

Tuning properties of simple cells in cortical V1 can be described in terms of a “universal shape” characterized quantitatively by parameter values which hold across different species (Jones and Palmer 1987; Ringach 2002; Niell and Stryker 2008). This puzzling set of findings calls for a general explanation grounded in an evolutionarily important computational function of the visual cortex. We show here that these properties are quantitatively predicted by the hypothesis that the goal of the ventral stream is to compute for each image a “signature” vector which is invariant to geometric transformations (Anselmi et al. 2013b). The mechanism for continuously learning and maintaining invariance may be the memory storage of a sequence of neural images of a few (arbitrary) objects via Hebbian synapses, while undergoing transformations such as translation, scale changes and rotation. For V1 simple cells this hypothesis implies that the tuning of neurons converges to the eigenvectors of the covariance of their input. Starting with a set of dendritic fields spanning a range of sizes, we show, with simulations suggested by a direct analysis, that the solution of the associated “cortical equation” effectively provides a set of Gabor-like shapes with parameter values that quantitatively agree with the physiology data. The same theory provides predictions about the tuning of cells in V4 and in the face patch AL (Leibo et al. 2013a) which are in qualitative agreement with physiology data.

References

Abdel-Hamid O, Mohamed A, Jiang H, Penn G (2012) Applying convolutional neural networks concepts to hybrid nn-hmm model for speech recognition. In: 2012 IEEE International conference on acoustics, speech and signal processing (ICASSP), pp 4277–4280. IEEE
Anselmi F, Leibo JZ, Mutch J, Rosasco L, Tacchetti A, Poggio T (2013a) Part I: computation of invariant representations in visual cortex and in deep convolutional architectures. In preparation
Anselmi F, Leibo JZ, Rosasco L, Mutch J, Tacchetti A, Poggio T (2013b) Unsupervised learning of invariant representations in hierarchical architectures. Theoret Comput Sci. CBMM Memo n 1, in press. arXiv:1311.4158
Anselmi F, Poggio T (2010) Representation learning in sensory cortex: a theory. CBMM memo n 26
Bell A, Sejnowski T (1997) The independent components of natural scenes are edge filters. Vis Res 3327–3338
Boyd J (1984) Asymptotic coefficients of hermite function series. J Comput Phys 54:382–410
Croner L, Kaplan E (1995) Receptive fields of p and m ganglion cells across the primate retina. Vis Res 35(1):7–24
Dan Y, Atick JJ, Reid RC (1996) Efficient coding of natural scenes in the lateral geniculate nucleus: experimental test of a computational theory. J Neurosci 16:3351–3362
Földiák P (1991) Learning invariance from transformation sequences. Neural Comput 3(2):194–200
Freiwald W, Tsao D (2010) Functional compartmentalization and viewpoint generalization within the macaque face-processing system. Science 330(6005):845
Fukushima K (1980) Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biol Cybern 36(4):193–202
Gallant J, Connor C, Rakshit S, Lewis J, Van Essen D (1996) Neural responses to polar, hyperbolic, and Cartesian gratings in area V4 of the macaque monkey. J Neurophysiol 76:2718–2739
Hebb DO (1949) The organization of behaviour: a neuropsychological theory. Wiley
Hyvärinen A, Oja E (1998) Independent component analysis by general non-linear hebbian-like learning rules. Signal Process 64:301–313
Jones JP, Palmer LA (1987) An evaluation of the two-dimensional Gabor filter model of simple receptive fields in cat striate cortex. J Neurophysiol 58(6):1233–1258
Kay K, Naselaris T, Prenger R, Gallant J (2008) Identifying natural images from human brain activity. Nature 452(7185):352–355
Krizhevsky A, Sutskever I, Hinton G (2012) Imagenet classification with deep convolutional neural networks. Adv Neural Inf Proces Syst 25
Le QV, Monga R, Devin M, Corrado G, Chen K, Ranzato M, Dean J, Ng AY (2011) Building high-level features using large scale unsupervised learning. CoRR. arXiv:1112.6209
LeCun Y, Boser B, Denker J, Henderson D, Howard R, Hubbard W, Jackel L (1989) Backpropagation applied to handwritten zip code recognition. Neural Comput 1(4):541–551
LeCun Y, Bengio Y (1995) Convolutional networks for images, speech, and time series. The handbook of brain theory and neural networks, pp 255–258
Leibo JZ, Anselmi F, Mutch J, Ebihara AF, Freiwald WA, Poggio T (2013a) View-invariance and mirror-symmetric tuning in a model of the macaque face-processing system. Comput Syst Neurosci I–54. Salt Lake City, USA
Leibo JZ, Anselmi F, Mutch J, Ebihara AF, Freiwald WA, Poggio T (2013b) View-invariance and mirror-symmetric tuning in a model of the macaque face-processing system. Comput Syst Neurosci (COSYNE)
Li N, DiCarlo JJ (2008) Unsupervised natural experience rapidly alters invariant object representation in visual cortex. Science 321(5895):1502–1507
Mallat S (2012) Group invariant scattering. Commun Pure Appl Math 65(10):1331–1398
Meister M, Wong R, Baylor DA, Shatz CJ et al (1991) Synchronous bursts of action potentials in ganglion cells of the developing mammalian retina. Science 252(5008):939–943
Mel BW (1997) SEEMORE: combining color, shape, and texture histogramming in a neurally inspired approach to visual object recognition. Neural Comput 9(4):777–804
Müller-Kirsten HJW (2012) Introduction to quantum mechanics: Schrödinger equation and path integral, 2nd edn. World Scientific, Singapore
Mutch J, Lowe D (2008) Object class recognition and localization using sparse features with limited receptive fields. Int J Comput Vis 80(1):45–57
Niell C, Stryker M (2008) Highly selective receptive fields in mouse visual cortex. J Neurosci 28(30):7520–7536
Oja E (1982) Simplified neuron model as a principal component analyzer. J Math Biol 15(3):267–273
Oja E (1992) Principal components, minor components, and linear neural networks. Neural Netw 5(6):927–935
Olshausen BA, Cadieu CF, Warland D (2009) Learning real and complex overcomplete representations from the statistics of natural images. In: Goyal VK, Papadakis M, van de Ville D (eds) SPIE Proceedings, vol. 7446: Wavelets XIII
Olshausen B et al (1996) Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature 381(6583):607–609
Perona P (1991) Deformable kernels for early vision. IEEE Trans Pattern Anal Mach Intell 17:488–499
Perrett D, Oram M (1993) Neurophysiology of shape processing. Image Vis Comput 11(6):317–333
Pinto N, DiCarlo JJ, Cox D (2009) How far can you get with a modern face recognition test set using only simple features? In: CVPR 2009. IEEE Conference on computer vision and pattern recognition, 2009. IEEE, pp 2591–2598
Poggio T, Edelman S (1990) A network that learns to recognize three-dimensional objects. Nature 343(6255):263–266
Poggio T, Mutch J, Anselmi F, Leibo JZ, Rosasco L, Tacchetti A (2011) Invariances determine the hierarchical architecture and the tuning properties of the ventral stream. Technical report available online, MIT CBCL, 2013. Previously released as MIT-CSAIL-TR-2012-035, 2012 and in Nature Precedings, 2011
Poggio T, Mutch J, Anselmi F, Leibo JZ, Rosasco L, Tacchetti A (2012) The computational magic of the ventral stream: sketch of a theory (and why some deep architectures work). Technical report MIT-CSAIL-TR-2012-035, MIT Computer Science and Artificial Intelligence Laboratory, 2012. Previously released in Nature Precedings, 2011
Poggio T, Mutch J, Isik L (2014) Computational role of eccentricity dependent cortical magnification. CBMM Memo No. 017. CBMM Funded. arXiv:1406.1770v1
Rehn M, Sommer FT (2007) A network that uses few active neurones to code visual input predicts the diverse shapes of cortical receptive fields. J Comput Neurosci 22(2):135–146
Riesenhuber M, Poggio T (1999) Hierarchical models of object recognition in cortex. Nature Neurosci. 2(11):1019–1025
Ringach D (2002) Spatial structure and symmetry of simple-cell receptive fields in macaque primary visual cortex. J Neurophysiol 88(1):455–463
Saxe AM, Bhand M, Mudur R, Suresh B, Ng AY (2011) Unsupervised learning models of primary cortical receptive fields and receptive field plasticity. In: Shawe-Taylor J, Zemel R, Bartlett P, Pereira F, Weinberger K (eds) Advances in neural information processing systems, vol 24, pp 1971–1979
Serre T, Wolf L, Bileschi S, Riesenhuber M, Poggio T (2007) Robust object recognition with cortex-like mechanisms. IEEE Trans Pattern Anal Mach Intell 29(3):411–426
Stevens CF (2004) Preserving properties of object shape by computations in primary visual cortex. PNAS 101(11):15524–15529
Stringer S, Rolls E (2002) Invariant object recognition in the visual system with novel views of 3D objects. Neural Comput 14(11):2585–2596
Torralba A, Oliva A (2003) Statistics of natural image categories. In: Network: computation in neural systems, pp 391–412
Turrigiano GG, Nelson SB (2004) Homeostatic plasticity in the developing nervous system. Nature Rev Neurosci 5(2):97–107
Wong R, Meister M, Shatz C (1993) Transient period of correlated bursting activity during development of the mammalian retina. Neuron 11(5):923–938
Zylberberg J, Murphy JT, DeWeese MR (2011) A sparse coding model with synaptically local plasticity and spiking neurons can account for the diverse shapes of v1 simple cell receptive fields. PLoS Comput Biol, 7(10):135–146

Acknowledgments

This work was supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF 1231216.

Author information

Authors and Affiliations

Massachusetts Institute of Technology (MIT), 77 Massachusetts Ave. MIT Bldg 46, Cambridge, MA, 02139, USA
Jim Mutch, Fabio Anselmi, Andrea Tacchetti, Lorenzo Rosasco, Joel Z. Leibo & Tomaso Poggio

Corresponding author

Correspondence to Fabio Anselmi.

Editor information

Editors and Affiliations

Electrical and Computer Engineering, National University of Singapore, Singapore, Singapore
Qi Zhao

1 Appendix

1.1 Retinal Processing

Our simulation pipeline consists of several filtering steps that mimic retinal processing, followed by a Gaussian mask, as shown in Fig. 4. Values for the DoG filter were those suggested by Croner and Kaplan (1995); the spatial lowpass filter has frequency response $1/\sqrt{\omega ^{2}_{x}+\omega ^{2}_{y}}$. The temporal derivative is performed using imbalanced weights (−0.95, 1) so that the DC component is not zero. Each cell learns by extracting the principal components of a movie generated by a natural image patch undergoing a rigid translation. Each frame goes through the pipeline described here and is then fed to the unsupervised learning module (computing eigenvectors of the covariance). We used 40 natural images and 19 different Gaussian apertures for the simulations presented in this book chapter (Fig. 5).
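
The effect of the imbalanced temporal weights can be illustrated with a minimal sketch. Only the weights (-0.95, 1) come from the text; the function name `temporal_derivative` and the toy frames are assumptions made for illustration, and this is not the simulation code used for the chapter.

```python
def temporal_derivative(frames, w_prev=-0.95, w_curr=1.0):
    """Weighted difference of consecutive frames (each frame is a flat list of pixels)."""
    out = []
    for prev, curr in zip(frames, frames[1:]):
        out.append([w_prev * p + w_curr * c for p, c in zip(prev, curr)])
    return out

# A static, uniform input: a balanced derivative (-1, 1) maps it to zero,
# whereas the imbalanced weights let a fraction of the DC level through.
static = [[2.0, 2.0, 2.0]] * 3
leaky = temporal_derivative(static)                # weights (-0.95, 1)
exact = temporal_derivative(static, w_prev=-1.0)   # balanced weights, for comparison
print(leaky[0], exact[0])
```

With balanced weights the constant component of the input would be removed entirely; the imbalance keeps 5% of it, which is the sense in which the DC component is not zero.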

1.2 Additional Evidence for Gabor Shapes as Templates in V1

In addition to Jones and Palmer (1987), Niell and Stryker (2008) and Ringach (2002), a recent paper (Kay et al. 2008) shows that the assumption of a system of Gabor wavelets in V1 provides a very good fit to fMRI data. Note that, because of Hebbian synapses, the templates of the theory described in Anselmi et al. (2013a) become Gabor-like eigenfunctions during unsupervised learning, as described here.

1.3 Hebbian Rule and Gabor-Like Functions

In this section we show how, from the hypothesis that the synaptic weights of a simple cell change according to a Hebbian rule, the tuning properties of the simple cells in V1 converge to Gabor-like functions.
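
As a hedged illustration of why a Hebbian rule leads to eigenvectors of the input covariance, the sketch below iterates the averaged form of Oja's rule (Oja 1982), $w \leftarrow w + \eta\,(Cw - (w^{T}Cw)\,w)$, on a small covariance matrix. The matrix `C`, the learning rate, the starting weights and the function name `oja_mean_field` are all arbitrary assumptions chosen for illustration.

```python
import math

def oja_mean_field(C, w, eta=0.05, steps=2000):
    """Iterate the averaged Oja update  w <- w + eta * (C w - (w^T C w) w)."""
    n = len(w)
    for _ in range(steps):
        Cw = [sum(C[i][j] * w[j] for j in range(n)) for i in range(n)]
        wCw = sum(w[i] * Cw[i] for i in range(n))
        w = [w[i] + eta * (Cw[i] - wCw * w[i]) for i in range(n)]
    return w

# Toy 2x2 covariance (hypothetical values); its largest eigenvalue is (5 + sqrt(5)) / 2.
C = [[3.0, 1.0], [1.0, 2.0]]
w = oja_mean_field(C, [0.1, 0.9])
print(w, math.sqrt(sum(c * c for c in w)))
```

The weight vector settles on the unit-norm eigenvector associated with the largest eigenvalue of `C`, which is the finite-dimensional analogue of the convergence claimed for simple-cell tuning.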

We consider, for simplicity, the 1D case (see also Poggio et al. 2013 for a derivation and properties). The associated eigenproblem is

$$\begin{aligned} \int dx g(y) g(x) \psi _n(x) t^{\circledast }(y-x) = \nu _n \psi _n(y) \end{aligned}$$

(2)

where $t^{\circledast }$ is the autocorrelation function of the template t, g is a Gaussian function with fixed $\sigma $, $\nu _{n}$ are the eigenvalues and $\psi _{n}$ are the eigenfunctions (see Poggio et al. 2012; Perona 1991 for solutions in the case where there is no Gaussian).
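
To make the eigenproblem in Eq. (2) concrete, the following sketch discretizes the kernel $g(y)g(x)t^{\circledast }(y-x)$ on a grid and extracts its leading eigenvectors by power iteration with deflation. The exponential autocorrelation $e^{-|x|/L}$, the grid, and the helper names are assumptions made purely for illustration; they are not taken from the chapter.

```python
import math

def cortical_matrix(xs, sigma=1.0, corr_len=1.0):
    """Discretize K(y, x) = g(y) g(x) t(y - x) on the grid xs (t is an assumed exponential)."""
    dx = xs[1] - xs[0]
    g = [math.exp(-x * x / (2.0 * sigma ** 2)) for x in xs]
    return [[g[i] * g[j] * math.exp(-abs(xs[i] - xs[j]) / corr_len) * dx
             for j in range(len(xs))] for i in range(len(xs))]

def leading_eigvec(K, previous=(), iters=300):
    """Power iteration, projecting out previously found eigenvectors at every step."""
    n = len(K)
    v = [1.0 + 0.1 * i for i in range(n)]   # asymmetric start: overlaps even and odd modes
    for _ in range(iters):
        for u in previous:
            dot = sum(a * b for a, b in zip(u, v))
            v = [a - dot * b for a, b in zip(v, u)]
        w = [sum(k * b for k, b in zip(row, v)) for row in K]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v

def sign_changes(v, tol=1e-9):
    """Number of sign changes along v, ignoring entries that are numerically zero."""
    signs = [c > 0 for c in v if abs(c) > tol]
    return sum(a != b for a, b in zip(signs, signs[1:]))

xs = [-3.0 + 6.0 * i / 40 for i in range(41)]
K = cortical_matrix(xs)
modes = []
for _ in range(3):
    modes.append(leading_eigvec(K, previous=modes))
print([sign_changes(m) for m in modes])
```

Under these assumptions the successive eigenvectors are localized by the Gaussian window and gain one zero crossing each, which is the qualitative Gabor-like behavior analyzed in the rest of this section.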

1.3.1 Approximate Ansatz Solution for Piecewise Constant Spectrum

We start by representing the template autocorrelation function as the inverse Fourier transform of its spectrum:

$$\begin{aligned} t^{\circledast }(x)=\frac{1}{\sqrt{2\pi }}\int d\omega \;t^{\circledast }(\omega )e^{i\omega x}. \end{aligned}$$

(3)

Let $\alpha =1/\sigma ^{2}_{x}$, $\beta =1/\sigma ^{2}_{\psi }$ and assume that the eigenfunctions have the form $\psi _{n}(x) = e^{-\frac{\beta }{2}x^2}e^{i\omega _{n}x}$, where $\beta $ and $\omega _{n}$ are parameters to be found. Assume also that $g(x)=\exp (-(\alpha /2)x^{2})$. With these assumptions Eq. (2) reads:

$$\begin{aligned} \frac{1}{\sqrt{2\pi }}e^{-\frac{\alpha }{2}y^2}\int dx\; e^{-\frac{x^2 (\alpha +\beta )}{2}}\int d\omega \;t^{\circledast }(\omega )e^{i\omega (y-x)}e^{i\omega _{n}x}= \nu (\omega _{n})e^{-\frac{\beta y^2}{2}}e^{i\omega _{n} y}. \end{aligned}$$

(4)

Collecting the terms in x and integrating in x, the l.h.s. becomes:

$$\begin{aligned} \sqrt{\frac{1}{\alpha +\beta }}e^{-\frac{\alpha }{2}y^2}\int d\omega \;t^{\circledast }(\omega )e^{i\omega y}e^{-\frac{(\omega -\omega _{n})^{2}}{2(\alpha +\beta )}}. \end{aligned}$$

(5)
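
The Gaussian integral in x that takes Eq. (4) to Eq. (5) can be checked numerically. In the sketch below `delta` stands for $\omega -\omega _{n}$; the parameter values and function names are arbitrary assumptions used only for the check.

```python
import cmath
import math

def lhs_x_integral(alpha, beta, delta, half_width=10.0, n=20001):
    """Trapezoid estimate of (1/sqrt(2 pi)) * int dx exp(-(alpha+beta) x^2 / 2) exp(-i delta x)."""
    dx = 2.0 * half_width / (n - 1)
    total = 0.0 + 0.0j
    for k in range(n):
        x = -half_width + k * dx
        weight = 0.5 if k in (0, n - 1) else 1.0
        total += weight * math.exp(-(alpha + beta) * x * x / 2.0) * cmath.exp(-1j * delta * x)
    return total * dx / math.sqrt(2.0 * math.pi)

def closed_form(alpha, beta, delta):
    """Factor appearing in Eq. (5): sqrt(1/(alpha+beta)) * exp(-delta^2 / (2 (alpha+beta)))."""
    return math.sqrt(1.0 / (alpha + beta)) * math.exp(-delta ** 2 / (2.0 * (alpha + beta)))

numeric = lhs_x_integral(alpha=1.0, beta=2.0, delta=1.5)
print(numeric, closed_form(1.0, 2.0, 1.5))
```

The imaginary part of the numerical value vanishes by symmetry, and the real part agrees with the closed-form Gaussian in $\omega -\omega _{n}$ that appears under the $\omega $ integral in Eq. (5).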

With the variable change $\bar{\omega }= \omega -\omega _{n}$ and under the hypothesis that $t^{\circledast }(\bar{\omega }+\omega _{n})\approx const$ over the significant support of the Gaussian centered at 0, integrating in $\bar{\omega }$ we obtain:

$$\begin{aligned} \sqrt{2\pi } e^{-\frac{y^2 \alpha }{2}} e^{i \omega _{n} y} e^{- \frac{y^2 (\alpha +\beta )}{2}} \sim \nu (\omega _{n}) e^{-\frac{y^2 \beta }{2}} e^{i\omega _{n} y}. \end{aligned}$$

(6)

Notice that this implies an upper bound on $\beta $ since otherwise t would be white noise which is inconsistent with the diffraction-limited optics of the eye. Thus the condition in Eq. (6) holds approximately over the relevant y interval which is between $- \sigma _{\psi }$ and $+ \sigma _{\psi }$ and therefore Gabor functions are an approximate solution of Eq. (2).

We prove now that the orthogonality conditions of the eigenfunctions lead to Gabor wavelets. Consider, e.g., the approximate eigenfunction $\psi _1$ with frequency $ \omega _0$. The minimum value of $\omega _0$ is set by the condition that $\psi _1$ has to be roughly orthogonal to the constant (this assumes that the visual input does have a DC component, which implies that there is no exact derivative stage in the input filtering by the retina).

$$\begin{aligned} \langle \psi _{0},\psi _{1}\rangle =C_{(0,1)}\;\int dx\; e^{-\beta x^2}e^{-i\omega _{0}x}=0 \; \Rightarrow \;e^{-\frac{\omega _{0}^{2}}{4\beta }} \approx 0 \end{aligned}$$

(7)

where $C_{(0,1)}$ is the product of the normalizing factors of the eigenfunctions.

Using $2\pi f_{0}=\frac{2 \pi }{\lambda _0} = \omega _{0}$ the condition above implies $e^{-(\frac{\pi \sigma _{\psi }}{\lambda _{0}})^{2}} \approx 0$ which can be satisfied with $\sigma _{\psi } \ge \lambda _0$; the condition $\sigma _{\psi } \sim \lambda _0$ is enough since it implies $e^{-(\frac{\pi \sigma _{\psi }}{\lambda _0})^2} \approx e^{-\pi ^2}$.
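
A short numerical check of this condition, as a sketch: the normalized overlap between the Gaussian envelope and the oscillation is compared with the closed form $e^{-(\pi \sigma _{\psi }/\lambda _{0})^{2}}$. Only the real (cosine) part is integrated, since the sine part vanishes by symmetry; the function names and test values are illustrative assumptions.

```python
import math

def gaussian_cosine_overlap(sigma_psi, wavelength, n=16001):
    """Trapezoid estimate of int exp(-beta x^2) cos(omega0 x) dx / int exp(-beta x^2) dx."""
    beta = 1.0 / sigma_psi ** 2
    omega0 = 2.0 * math.pi / wavelength
    half_width = 8.0 * sigma_psi
    dx = 2.0 * half_width / (n - 1)
    num = den = 0.0
    for k in range(n):
        x = -half_width + k * dx
        weight = 0.5 if k in (0, n - 1) else 1.0
        envelope = weight * math.exp(-beta * x * x)
        num += envelope * math.cos(omega0 * x)
        den += envelope
    return num / den

def predicted_overlap(sigma_psi, wavelength):
    """Closed form exp(-(pi sigma_psi / lambda_0)^2) from the orthogonality condition."""
    return math.exp(-(math.pi * sigma_psi / wavelength) ** 2)

for sigma_psi in (0.5, 1.0):
    print(sigma_psi, gaussian_cosine_overlap(sigma_psi, 1.0), predicted_overlap(sigma_psi, 1.0))
```

For $\sigma _{\psi }=\lambda _{0}$ the overlap is $e^{-\pi ^{2}}$, below $10^{-4}$, whereas for $\sigma _{\psi }=\lambda _{0}/2$ it is several percent, which illustrates why $\sigma _{\psi }\sim \lambda _{0}$ suffices for approximate orthogonality.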

Imposing orthogonality of any pair of eigenfunctions:

$$\begin{aligned} \int dx\; \psi ^{*}_n(x) \psi _m(x) = const(m,n)\int dx e^{-\beta x^2} e^{ i n \omega _0 x} e^{- i m \omega _0 x} \propto e^{- \frac{((m-n) \omega _0)^2 \sigma ^{2}_{\psi }}{4}}, \end{aligned}$$

we have a condition similar to the one above. This implies that $\lambda _{n}$ should increase with the $\sigma _{\psi }$ of the Gaussian aperture, which is a property of Gabor wavelets, even if this is valid here only for $n=0,1,2$.

1.3.2 Differential Equation Approach

In this section we describe another approach to the analysis of the cortical equation which is somewhat restricted but interesting for the potential connections with classical problems in mathematical physics.

Suppose as in the previous paragraph $g(x)= e^{-\frac{\alpha }{2} x^2}$. The eigenproblem (2) can be written as:

$$\begin{aligned} \nu _{n}\psi _{n}(y)-e^{-\frac{\alpha }{2} y^2}\int \;dx\;e^{-\frac{\alpha }{2}x^2}t^{\circledast }(y-x)\psi _{n}(x) = 0, \end{aligned}$$

(8)

or equivalently, multiplying both sides by $e^{+\frac{\alpha }{2} y^2}$ and defining the function $\xi _{n}(x)=e^{+\frac{\alpha }{2} x^2}\psi _{n}(x)$, as

$$\begin{aligned} \nu _{n}\xi _{n}(y)-\int \;dx\;e^{-\alpha x^{2}}t^{\circledast }(y-x)\xi _{n}(x) = 0. \end{aligned}$$

(9)

Decomposing $t^{\circledast }(x)$ as in Eq. (3) in Eq. (9):

$$ \nu _{n}\xi _{n}(y)-\frac{1}{\sqrt{2\pi }}\int \;dx\;e^{-\alpha x^{2}}\int \;d\omega \;t^{\circledast }(\omega )e^{i\omega (y-x)}\xi _{n}(x) = 0. $$

Differentiating twice in the y variable and rearranging the order of the integrals:

$$\begin{aligned} \nu _{n}\xi ^{''}_{n}(y)+\frac{1}{\sqrt{2\pi }}\int \;d\omega \;\omega ^{2}t^{\circledast }(\omega )e^{i\omega y}\int \;dx\;e^{-\alpha x^2}\xi _{n}(x)e^{-i\omega x}=0. \end{aligned}$$

(10)

The expression above is equivalent to the original eigenproblem in Eq. (2) and will provide the same $\psi $ modulo a first order polynomial in x (we will show the equivalence in the next paragraph where we specialize the template to natural images).

Indicating with $\mathfrak {F}$ the Fourier transform we can rewrite (10) as:

$$ \nu _{n}\xi ^{''}_{n}(y)+\sqrt{2\pi }\mathfrak {F}^{-1}\Big (\omega ^{2}t^{\circledast }(\omega )\mathfrak {F}\big (e^{-\alpha x^2}\xi _{n}(x)\big )\Big )=0. $$

Indicating with $*$ the convolution operator, by the convolution theorem

$$f*g = \mathfrak {F}^{-1}(\mathfrak {F}(f)\mathfrak {F}(g)),\;\;\forall f,g\in L^{2}(\mathbb {R})$$

we have

$$ \nu _{n}\xi ^{''}_{n}(y)+\sqrt{2\pi }\mathfrak {F}^{-1}\big (\omega ^{2}t^{\circledast }(\omega )\big )*\big (e^{-\alpha x^2}\xi _{n}(x)\big )=0. $$

Expanding $\omega ^{2}t^{\circledast }(\omega )$ in a Taylor series, $\omega ^{2}t^{\circledast }(\omega )=\sum _{i}c_{i}\omega ^{i}$, and remembering that $\mathfrak {F}^{-1}(\omega ^{m})=i^{m}\sqrt{2\pi }\delta ^{m}(x)$, we are finally led to

$$\begin{aligned} \nu _{n}\xi ^{''}_{n}(y)+2\pi (c_{0}\delta + ic_{1}\delta ^{'}+\ldots )*\big (e^{-\alpha x^2}\xi _{n}(x)\big )=0. \end{aligned}$$

(11)

The differential equation so obtained is difficult to solve for a generic power spectrum. In the next paragraph we study the case where we can obtain explicit solutions.

Case: $1/\omega ^{2}$ Power Spectrum

For the average power spectrum of natural images,

$$ t^{\circledast }(\omega )=\frac{1}{\omega ^{2}} $$

the differential equation (11) assumes the particularly simple form

$$\begin{aligned} \nu _{n}\xi ^{''}_{n}(y)+2\pi e^{-\alpha y^2}\xi _{n}(y)=0. \end{aligned}$$

(12)

In the harmonic approximation, $e^{-\alpha y^2} \approx 1-\alpha y^2$ (valid for $\sqrt{\alpha }y\ll 1$) we have

$$\begin{aligned} \xi ^{''}_{n}(y)+\frac{2\pi }{\nu _{n}}\big (1-\alpha y^2\big )\xi _{n}(y)=0. \end{aligned}$$

(13)

The equation above has the form of a so-called Weber differential equation:

$$ \xi ^{''}(y)+(ay^{2}+by+c)\xi (y)=0,\;\;a,b,c\in \mathbb {R}$$

The general solutions of Eq. (13) are:

$$ \xi _{n}(y)= C_{1} D \Big (-\frac{1}{2}+\frac{\pi ^{\frac{1}{2}}}{\sqrt{2\alpha \nu _{n}}},\frac{2^{\frac{3}{4}}\alpha ^{\frac{1}{4}}\pi ^{\frac{1}{4}}}{{\nu _{n}}^{\frac{1}{4}}}\,y\Big ) + C_{2} D \Big (-\frac{1}{2}-\frac{\pi ^{\frac{1}{2}}}{\sqrt{2\alpha \nu _{n}}},i\frac{2^{\frac{3}{4}}\alpha ^{\frac{1}{4}}\pi ^{\frac{1}{4}}}{{\nu _{n}}^{\frac{1}{4}}}\,y\Big ) $$

where $D(\eta ,y)$ are parabolic cylinder functions and $C_{1},C_{2}$ are constants. It can be proved (Müller-Kirsten 2012, p. 139) that the solutions have two different behaviors, exponentially increasing or exponentially decreasing, and that we have exponentially decreasing real solutions if $C_{2}=0$ and the following quantization condition holds:

$$ -\frac{1}{2}+\frac{\pi ^{\frac{1}{2}}}{\sqrt{2\alpha \nu _{n}}}=n,\;n=0,1,\ldots $$

Therefore, remembering that $\alpha =1/\sigma ^{2}_{x}$, we obtain the spectrum quantization condition

$$\begin{aligned} \nu _{n}=2\pi \frac{\sigma ^{2}_{x}}{(2n+1)^{2}},\;n=0,1,\ldots \end{aligned}$$

(14)

Further, using the identity (true if $n\in \mathbb {N}$):

$$ D(n,y)=2^{-\frac{n}{2}}e^{-\frac{y^{2}}{4}}H_{n}\big (\frac{y}{\sqrt{2}}\big ) $$

where $H_{n}(y)$ are Hermite polynomials, we have:

$$ \xi _{n}(y) =2^{-\frac{n}{2}}e^{-\frac{2n+1}{2\sigma ^{2}_{x}}y^{2}}H_{n}\Big (\frac{\sqrt{2n+1}}{\sigma _{x}}y\Big ) $$

i.e.

$$\begin{aligned} \psi _{n}(y) = 2^{-\frac{n}{2}}e^{-\frac{n+1}{\sigma ^{2}_{x}}y^{2}}H_{n}\Big (\frac{\sqrt{2n+1}}{\sigma _{x}}y\Big ) \end{aligned}$$

(15)

The solutions plotted in Fig. 6 approximate Gabor functions very well.
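
As a sanity check on Eqs. (13) to (15), the sketch below evaluates $\xi _{n}$ and $\psi _{n}$ with the Hermite three-term recurrence and verifies by finite differences that $\xi _{n}$, with $\nu _{n}$ from Eq. (14), satisfies the harmonic-approximation equation (13). The function names, the step size and the test points are illustrative assumptions.

```python
import math

def hermite(n, u):
    """Physicists' Hermite polynomial H_n(u) via H_{k+1} = 2u H_k - 2k H_{k-1}."""
    h_prev, h = 1.0, 2.0 * u
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2.0 * u * h - 2.0 * k * h_prev
    return h

def nu(n, sigma_x):
    """Eigenvalue from the quantization condition, Eq. (14)."""
    return 2.0 * math.pi * sigma_x ** 2 / (2 * n + 1) ** 2

def xi(n, y, sigma_x):
    """xi_n(y) as given just before Eq. (15)."""
    s = math.sqrt(2 * n + 1) / sigma_x
    envelope = math.exp(-(2 * n + 1) * y * y / (2.0 * sigma_x ** 2))
    return 2.0 ** (-n / 2.0) * envelope * hermite(n, s * y)

def psi(n, y, sigma_x):
    """psi_n(y) = exp(-y^2 / (2 sigma_x^2)) * xi_n(y), i.e. Eq. (15)."""
    return math.exp(-y * y / (2.0 * sigma_x ** 2)) * xi(n, y, sigma_x)

def weber_residual(n, y, sigma_x, h=1e-4):
    """Finite-difference residual of xi'' + (2 pi / nu_n) (1 - y^2 / sigma_x^2) xi = 0."""
    d2 = (xi(n, y + h, sigma_x) - 2.0 * xi(n, y, sigma_x) + xi(n, y - h, sigma_x)) / h ** 2
    return d2 + (2.0 * math.pi / nu(n, sigma_x)) * (1.0 - y * y / sigma_x ** 2) * xi(n, y, sigma_x)

for n in range(4):
    print(n, nu(n, 1.0), max(abs(weber_residual(n, y, 1.0)) for y in (-1.0, -0.3, 0.0, 0.4, 1.0)))
```

The residuals are at the level of the finite-difference error, consistent with $\xi _{n}$ being a Hermite function rescaled by $\sqrt{2n+1}/\sigma _{x}$.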

Remark: The solution in Eq. (15) is also an approximate solution for any template spectrum such that

$$ \omega ^{2} t^{\circledast }(\omega )= const + O(\omega ) $$

This is important since it shows that the solutions are robust to small changes in the power spectrum of natural images.

Aperture Ratio

Using Eq. (15) the ratio between the width of the Gaussian aperture and that of the eigenfunctions can be calculated as

$$\begin{aligned} \frac{\sigma _{x}}{\sigma _{\psi _{n}}}= \sqrt{n+1} \end{aligned}$$

(16)

If we consider the first eigenfunction ($n=1$), the ratio is $\sqrt{2}$.

Oscillating Behavior of Solutions

Although there is no explicit expression for the frequency of the oscillating part of $\psi _{n}$ in Eq. (15), we can use the following approximation, which calculates the frequency from the first crossing points of $H_{n}(x)$, i.e. $\pm \sqrt{2n+1}$ (Boyd 1984). The oscillating part of (15) can therefore be written, in this approximation, as

$$ \cos \big (\frac{2n+1}{\sigma _{x}}y-n\frac{\pi }{2}\big ) $$

which gives $\omega _{n} =(2n+1)/\sigma _{x}$. Using $\omega _{n}=2\pi /\lambda _{n}$ and (16) we finally have

$$ \frac{\lambda _{n}}{\sigma _{\psi _{n}}}=\frac{2\pi \sqrt{n+1}}{2n+1}. $$

The above equation gives an important piece of information: for any fixed eigenfunction, $n=\bar{n}$, the number of oscillations under the Gaussian envelope is constant.
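
The constancy of this ratio can be made explicit with a few lines of arithmetic. The sketch below recomputes $\lambda _{n}/\sigma _{\psi _{n}}$ from $\omega _{n}=(2n+1)/\sigma _{x}$ and Eq. (16) for two aperture sizes; the function names and the aperture values are illustrative assumptions.

```python
import math

def wavelength_over_width(n):
    """Closed form: lambda_n / sigma_psi_n = 2 pi sqrt(n + 1) / (2n + 1)."""
    return 2.0 * math.pi * math.sqrt(n + 1) / (2 * n + 1)

def ratio_from_aperture(n, sigma_x):
    """Same ratio rebuilt from omega_n = (2n + 1) / sigma_x and Eq. (16)."""
    omega_n = (2 * n + 1) / sigma_x
    wavelength = 2.0 * math.pi / omega_n
    sigma_psi = sigma_x / math.sqrt(n + 1)
    return wavelength / sigma_psi

for n in range(1, 4):
    print(n, ratio_from_aperture(n, 0.5), ratio_from_aperture(n, 3.0), wavelength_over_width(n))
```

For each n the value is the same for both apertures, so the wavelength scales with the width of the envelope, as expected of a family of wavelets.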

Equivalence of the Integral and Differential Equations

To prove that the solutions of the eigenproblem (9) are equivalent to those of (12) we start with the observation that, for the natural-image power spectrum, we can write Eq. (3) explicitly:

$$\begin{aligned} t^{\circledast }(y-x)=\frac{1}{\sqrt{2\pi }}\int d\omega \;t^{\circledast }(\omega )e^{i\omega (y-x)}=\frac{1}{\sqrt{2\pi }}\int \;d\omega \;\frac{e^{i\omega (y-x)}}{\omega ^{2}}= -\sqrt{\frac{\pi }{2}}|y-x|. \end{aligned}$$

(17)

The integral equation (9) can be written, for $a>0,\;a\rightarrow \infty $, as

$$ \xi (y)+c\int _{-a}^{a}e^{-\alpha x^{2}}|y-x|\xi (x)dx = 0 $$

where for simplicity we dropped the index n and $c=\sqrt{\pi }/(\sqrt{2}\nu )$. Removing the modulus,

$$ \xi (y)+c\int _{-a}^{y}e^{-\alpha x^{2}}(y-x)\xi (x)dx + c\int _{y}^{a}e^{-\alpha x^{2}}(x-y)\xi (x)dx = 0. $$

Putting $y=a,\;y=-a$ we can derive the boundary conditions

$$\begin{aligned}&\xi (a)+c\int _{-a}^{a}e^{-\alpha x^{2}}(a-x)\xi (x)dx = 0\\&\xi (-a)+c\int _{-a}^{a}e^{-\alpha x^{2}}(a+x)\xi (x)dx = 0 \end{aligned}$$

Substituting, using the differential equation, $e^{-\alpha x^{2}}\xi (x)$ with $-\xi ''(x)\nu /2\pi $ and integrating by parts we have

$$\begin{aligned}&\xi (a)+ \xi (-a)+2ac'\xi '(-a) = 0\\&\xi (a)+ \xi (-a)-2ac'\xi '(a) = 0 \end{aligned}$$

where $c'=1/2\sqrt{2}$. The above two boundary conditions together with the differential equation are equivalent to the initial integral eigenproblem. If we want bounded solutions at infinity, we require $\xi (\infty )=\xi (-\infty )=\xi '(\infty )=\xi '(-\infty )=0$.
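
The passage between the integral and the differential form rests on the distributional identity $\frac{d^{2}}{dy^{2}}|y-x| = 2\delta (y-x)$: for a smooth weight w, $F(y)=\int w(x)|y-x|dx$ satisfies $F''(y)=2w(y)$. The sketch below checks this numerically; the weight $w(x)=e^{-x^{2}}$, the grid and the function names are arbitrary assumptions for illustration.

```python
import math

def absolute_value_transform(w, y, half_width=6.0, dx=1e-3):
    """Trapezoid estimate of F(y) = int w(x) |y - x| dx on [-half_width, half_width]."""
    n = int(round(2.0 * half_width / dx))
    total = 0.0
    for k in range(n + 1):
        x = -half_width + k * dx
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * w(x) * abs(y - x)
    return total * dx

def second_derivative(w, y, step=0.05):
    """Central second difference of F; step is a multiple of dx so the kink sits on a grid node."""
    f = lambda s: absolute_value_transform(w, s)
    return (f(y + step) - 2.0 * f(y) + f(y - step)) / step ** 2

w = lambda x: math.exp(-x * x)
for y in (-1.0, 0.3):
    print(y, second_derivative(w, y), 2.0 * w(y))
```

The two printed columns agree to within the finite-difference error, which is the mechanism by which two derivatives in y turn the integral operator with kernel $|y-x|$ into a local multiplication by $e^{-\alpha y^{2}}$.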

1.4 Motion Determines a Consistent Orientation of the Gabor-Like Eigenfunctions

Consider a 2D image moving through time t, $I(x(t),y(t))=I(\mathbf{{x}}(t))$, filtered, as in the pipeline of Fig. 4, by a spatial low-pass filter and a band-pass filter, and call the output $f(\mathbf{{x}}(t))$.

Suppose now that temporal filtering is performed with a high-pass impulse response h(t). For example, let $h(t) \sim \frac{d}{dt}$. We consider the effect of the time derivative on the translated signal, $\mathbf{x}(t)=\mathbf{x}-\mathbf{v}t$, where $\mathbf {v}\in \mathbb {R}^2$ is the velocity vector:

$$\begin{aligned} {\frac{{\text {d}}f(\mathbf{{x}}(t))}{{\text {d}}t}} = \nabla f (\mathbf{{x}}(t)) \cdot \mathbf {v}. \end{aligned}$$

(18)

If, for instance, the direction of motion is along the x axis with constant velocity, $\mathbf {v}=(v_{x},0)$, then Eq. (18) becomes

$$ \frac{{\text {d}} f (\mathbf{{x}}(t)) }{{\text {d}}t}= \frac{\partial f(\mathbf{{x}}(t))}{\partial x} v_{x}, $$

or, in Fourier domain of spatial and temporal frequencies:

$$\begin{aligned} \widehat{f}(i {\omega _t}) = i v_{x}\omega _{x}\hat{f}. \end{aligned}$$

(19)

Consider now an image I with a symmetric spectrum $1/(\sqrt{\omega ^{2}_{x}+\omega ^{2}_{y}})$. Equation (19) shows that the effect of the time derivative is to break the radial symmetry of the spectrum in the direction of motion (depending on the value of $v_{x}$). Intuitively, spatial frequencies in the x direction are enhanced. Thus motion effectively selects a specific orientation since it enhances the frequencies orthogonal to the direction of motion in Eq. (1).
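
The symmetry breaking can be illustrated with a small sketch: multiplying the isotropic amplitude $1/\sqrt{\omega ^{2}_{x}+\omega ^{2}_{y}}$ by the factor $|v_{x}\omega _{x}|$ from Eq. (19) gives an amplitude that depends only on the orientation of the frequency vector. The function names and sample angles are illustrative assumptions.

```python
import math

def amplitude_after_motion(omega_x, omega_y, v_x=1.0):
    """|i v_x omega_x| times the isotropic amplitude 1 / sqrt(omega_x^2 + omega_y^2)."""
    radius = math.hypot(omega_x, omega_y)
    return abs(v_x * omega_x) / radius

def amplitude_at_angle(theta_deg, radius=1.0, v_x=1.0):
    """Amplitude for a frequency vector of given radius at angle theta from the x axis."""
    theta = math.radians(theta_deg)
    return amplitude_after_motion(radius * math.cos(theta), radius * math.sin(theta), v_x)

for theta in (0, 30, 60, 90):
    print(theta, amplitude_at_angle(theta, radius=0.5), amplitude_at_angle(theta, radius=4.0))
```

The amplitude equals $|v_{x}||\cos \theta |$ at every radius: it is maximal for frequency vectors along the direction of motion and vanishes for those perpendicular to it, so the radial symmetry of the input spectrum is lost.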

Thus the theory suggests that motion effectively “selects” the direction of the Gabor-like function (see previous section) during the emergence and maintenance of a simple cell tuning. It turns out that, in addition to orientation, other features of the eigenvectors are shaped by motion during learning. This is shown by a simulation equivalent to that presented in Fig. 1 but in which the order of frames was scrambled before the time derivative stage. The receptive fields are still Gabor-like functions but lack the important property of having $\sigma _x \propto \lambda $. This is summarized in Fig. 7.

The theory also predicts, assuming that the cortical equation provides a perfect description of Hebbian synapses, that the even eigenfunctions have slightly different $n_x, n_y$ relations than odd eigenfunctions. It is unlikely that experimental data will allow this small difference to be distinguished.

1.5 Phase of Gabor RFs

We do not analyze the phase here for a variety of reasons. The main reason is that phase measurements are rather variable within each species and across species. Phase is also difficult to measure. The general shape shows a peak at 0 and a higher peak at 90. These peaks are consistent with the $n=2$ eigenfunctions (even) and the $n=1$ eigenfunctions (odd) of Eq. (2) (the zero-th order eigenfunction is not included in the graph). The relative frequency of each peak would depend, according to our theory, on the dynamics (Oja equation) of learning and on the properties of the lateral inhibition between simple cells (to converge to eigenfunctions other than the first one). It is in any case interesting that the experimental data qualitatively fit our predictions: the $\psi _1$ odd eigenfunction of the cortical equation should appear more often (because of its larger power) than the even $\psi _2$ eigenfunction, and no others with intermediate phases should exist, at least in the noiseless case (Fig. 8).

Cite this chapter

Mutch, J., Anselmi, F., Tacchetti, A., Rosasco, L., Leibo, J.Z., Poggio, T. (2017). Invariant Recognition Predicts Tuning of Neurons in Sensory Cortex. In: Zhao, Q. (eds) Computational and Cognitive Neuroscience of Vision. Cognitive Science and Technology. Springer, Singapore. https://doi.org/10.1007/978-981-10-0213-7_5

DOI: https://doi.org/10.1007/978-981-10-0213-7_5
Published: 04 October 2016
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-0211-3
Online ISBN: 978-981-10-0213-7
eBook Packages: Engineering, Engineering (R0)
