Cortically Based Optimal Transport

Galeotti, Mattia; Citti, Giovanna; Sarti, Alessandro

doi:10.1007/s10851-022-01116-9

Cortically Based Optimal Transport

Open access
Published: 30 July 2022

Volume 64, pages 1040–1057, (2022)
Cite this article

Download PDF

You have full access to this open access article

Journal of Mathematical Imaging and Vision Aims and scope Submit manuscript

Cortically Based Optimal Transport

Download PDF

1855 Accesses
2 Altmetric
Explore all metrics

Abstract

We introduce a model for image morphing in the primary visual cortex V1 to perform completion of missing images in time. We model the output of simple cells through a family of Gabor filters and the propagation of the neural signal accordingly to the functional geometry induced by horizontal connectivity. Then we model the deformation between two images as a path relying two different outputs. This path is obtained by optimal transport considering the Wasserstein distance geodesics associated to some probability measures naturally induced by the outputs on V1. The frame of Gabor filters allows to project back the output path, therefore obtaining an associated image stimulus deformation. We perform a numerical implementation of our cortical model, assessing its ability in reconstructing rigid motions of simple shapes.

Time Discrete Geodesics in Deep Feature Spaces for Image Morphing

Modelling of the Poggendorff Illusion via Sub-Riemannian Geodesics in the Roto-Translation Group

Image Morphing in Deep Feature Spaces: Theory and Applications

Article Open access 19 July 2020

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

The functional geometry of the visual cortex is a widely studied subject. It is known that cells of the primary visual cortex V1 are sensitive to specific features of the visual stimulus, like position, orientation, scale, colour, curvature, velocity and many others [17]. In the 1970s, the neurophysiologists Hubel and Wiesel discovered the modular organisation of the primary visual cortex [18], meaning that cells are spatially organized in such a way that for every point (x, y) of the retinal plane there is an entire set of cells, each one sensitive to a particular instance of the considered feature. This organisation corresponds to the so-called hypercolumnar structure. Hypercolumns of cells are then connected to each other by means of the horizontal connectivity, allowing cells of the same kind but sensitive to different points (x, y) of the stimulus to communicate. Hypercolumnar organization and neural connectivity between hypercolumns constitute the functional architecture of the visual cortex, that is the cortical structure underlying the low level processing of the visual stimulus. The mathematical modelling of the functional architecture of the visual cortex in terms of differential geometry was introduced in the seminal works of Hoffmann [15, 16], who proposed to model the hypercolumnar organization in terms of a fibre bundle. Many of such results dealing with differential geometry were given a unified framework under the new name of neurogeometry.

Petitot and Tondut [23], related the contact geometry introduced by Hoffmann with the geometry of illusory contours of Kanizsa [19]. The problem of completion of occluded object was afforded by computing geodesic curves in the contact structure.

Then, in [8] Citti and Sarti showed how the functional architecture could be described in terms of Lie groups structures. In particular, as proved by Daugman [11] the receptive profiles of simple cells can be modelled as a Gabor filter. Since these filters can be obtained via rotation and translation from a fixed one, the functional architecture of the whole family of simple cells has been described as the Euclidean motion group SE(2) [8]. In the presence of a visual stimulus $I: {\mathbb {R}}^2 \rightarrow [0, 1]$ on the retinal plane, the action of the whole family of simple cells is obtained by convolving the function I with the bank of Gabor filters. The output of the cells action will be a function $\mu : {\mathbb {R}}^2 \times S^1 \rightarrow {\mathbb {R}}$. The horizontal connectivity is strongly anisotropic, and it is modelled via a sub-Riemannian metric. Since it is very common that part of the visual input is occluded, the action of horizontal connectivity allows to complete the missing part by means of propagation in such a space, under the action of advection diffusion differential operators of Fokker-Planck type. Visual completion problems are then solved via geodesics or minimal surfaces. This approach was extended to scale in [29], to space-time in [4] and to frequency in [6]. In [13] and [30], the lifting has been extended to heterogeneous features defined in different groups. For an extended review on neurogeometry, see [9].

In this paper, we aim to reconsider the problem of completion of missing stimulus in time by means of morphing of one lifted cortical image in a different one in terms of optimal transport of a probability distribution in the functional geometry of the cortex. Two images can represent the same object at two different intervals of times, and different algorithms have been proposed to perform completion of missing images between the two. We recall the results of [32] and the model proposed by [33] on the image plane as a geodesic in the Wasserstein space.

We propose a cortical version of this phenomena, using geodesics in the cortical space endowed with the Wasserstein distance. We work in the manifold $M = {\mathbb {R}}^2 \times S^1\times {\mathbb {R}}^+$ of cells sensible to position orientation and scale (see also [29]), and we develop a model that treats well shape rotations. We point out that the sub-Riemannian metric in ${\mathbb {R}}^2\times S^1$ is important to keep together the shape of the object along the rotational movements. In fact, the distance function in this metric approximates the statistical correlations of boundaries in natural images [28].

In order to obtain a metric deformation, we consider the positive and negative part of the output $\mu $ at a given time, normalized as two probability measures $\mu ^+,\mu ^-$ over M. Following the papers [2, 3] of Ambrosio and Gigli, we consider the space ${\mathcal {P}}_2(X)$ of probabilities with finite 2-momentum over $X={\mathbb {R}}^2\times S^1\times {\mathbb {R}}$, endowed with the associated Wasserstein distance; on this space, we can find, for any pair of inputs $I_0,I_1$ taken at different times, a unique constant speed geodesic relying their associated output measures.

In particular, for any regular measure $\mu \in {\mathcal {P}}_2(X)$, we obtain Theorem 3.15, a generalization of previously known results, and it assures that for any measure $\nu \in {\mathcal {P}}_2(X)$ there exists a unique transport map T in the sense of Monge’s formulation of optimal transport. Using this transport map, it is possible to give an explicit description of the constant speed geodesics in ${\mathcal {P}}_2(X)$; in the case of the measures induced by the output functions, this is done in Remark 4.4 and Eq. (6.3). Using the frame properties of the family of Gabor filters, we reconstruct with Eq. (5.5) a path of images $I_t$ relying the initial and final input $I_0,I_1$.

In the two final sections, we develop a numerical implementation of our model. In fact, the frame generated by the odd Gabor mother function is not invertible in a discrete setting, and in order to make our model workable, we use a Wavelet Gabor Pyramid generated by an odd and an even Gabor function. The outputs obtained via these frames are transported following the same approach detailed above. As shown in Sect. 9, our model allows the deformation of simple shapes through rotation and translation, preserving the basic structures in the treated pictures. As we discuss, the same preservation is not attained by the numerical 2-dimensional ‘classical’ implementation of optimal transport, meaning the implementation of the optimal transport between the inputs $I_0,I_1$ seen as measures on ${\mathbb {R}}^2$ and taking the square Euclidean distance as the cost function.

In Sect. 2, we introduce the operation associating to any input an output function via the convolution with the family of Gabor filters. In Sect. 3, the classical problem of optimal transport is introduced, with the techniques necessary for a general solution. In Sect. 4, we describe the constant speed geodesics in ${\mathcal {P}}_2(X)$. In Sect. 5, we treat the properties of continuous frames such as the one of Gabor functions; they are fundamental in reconstructing a path of input images from the deformation path of output functions. Section 6 states that the measure deformation results are valid in our case. Finally, in Sect. 7 we find a constraining condition implying that the path of output functions $\mu _t$ is naturally induced by a path of input images $I_t$. In Sect. 8, we introduce the discrete setting and the Wave Gabor Pyramid. Finally, in Sect. 9 we discuss the numerical implementation and its results.

2 From the Retina to the Output Space

We consider an input function $I:{\mathbb {R}}^2\rightarrow [0,1]$ in $L^2({\mathbb {R}}^2)$. This functions models an input received in the retina plan, and it induces an output function on the cortex. In order to introduce the output function, we recall the odd series of Gabor filters. We call Gabor mother filter the function

$$\begin{aligned} \psi _{0,0,0,1}:({\tilde{x}},{\tilde{y}})\in {\mathbb {R}}^2\mapsto e^{-({\tilde{x}}^2+{\tilde{y}}^2)}\cdot \sin (2{\tilde{y}}), \end{aligned}$$

Moreover, we consider the roto-dilation defined by

$$\begin{aligned} A_{\theta ,\sigma }:=\sigma \cdot R_\theta =\sigma \cdot \left( \begin{array}{cc} \cos \theta &{}-\sin \theta \\ \sin \theta &{}\cos \theta \\ \end{array}\right) , \end{aligned}$$

for any $\theta \in S^1$ and $\sigma \in {\mathbb {R}}^+$. We also consider the application

$$\begin{aligned} A_{x,y,\theta ,\sigma }:({\tilde{x}},{\tilde{y}})\mapsto A_{\theta ,\sigma }({\tilde{x}},{\tilde{y}})+(x,y). \end{aligned}$$

All this allows the definition of a family of Gabor filters

$$\begin{aligned} \psi _{x,y,\theta ,\sigma }({\tilde{x}},{\tilde{y}})&:=\frac{1}{\sigma ^{3/2}}\cdot \psi _{0,0,0,1}\left( A_{\theta ,\sigma }^{-1}({\tilde{x}}-x,{\tilde{y}}-y)\right) \nonumber \\&=\frac{1}{\sigma ^{3/2}}\cdot \psi _{0,0,0,1}\left( A_{x,y,\theta ,\sigma }^{-1}({\tilde{x}},{\tilde{y}})\right) . \end{aligned}$$

In what follows, we will denote the filter $\psi _{0,0,0,1}$ by $\psi _0$ when there is no risk of confusion. We consider the variety $M:= {\mathbb {R}}^2\times S^1\times {\mathbb {R}}^+$ with its natural (Lebesgue) measure dk; this is the output space where we build the $\mu $ function induced by the input I. For any point $k=(x,y,\theta ,\sigma )\in M$, we denote by $\psi ^k$ the Gabor filter $\psi _{x,y,\theta ,\sigma }$.

Definition 2.1

Consider a point $k=(x,y,\theta ,\sigma )$ on M, then the output function of a cell in response to the visual input I is

$$\begin{aligned} \mu (x,y,\theta ,\sigma )=\mu (k)&:=\langle I,\psi ^k\rangle \\&=\int _{{\mathbb {R}}^2}I({\tilde{x}},{\tilde{y}})\psi _{x,y,\theta ,\sigma } ({\tilde{x}},{\tilde{y}})\mathrm{d}{\tilde{x}}\mathrm{d}{\tilde{y}}.\\ \end{aligned}$$

Remark 2.2

We work in the case $M={\mathbb {R}}^2\times S^1\times {\mathbb {R}}^+$ and we put on M the Riemannian structure which endows the neurogeometry of the cortex. In particular, the ${\mathbb {R}}^2$ factor is the retinal plan with the natural projection $M\rightarrow {\mathbb {R}}^2$, $\theta \in S^1$ is an angular parameter that encodes the border orientation in the processes of border recognition, while $\sigma \in {\mathbb {R}}^+$ is a scale parameter in the same process.

In order to define this metric g, consider the following four vector fields, in every point they span the M tangent bundle,

$$\begin{aligned} X_1&=\cos \theta \cdot \partial _x+\sin \theta \cdot \partial _y,\\ X_2&=\partial _\theta ,\\ X_3&=-\sin \theta \cdot \partial _x+\cos \theta \cdot \partial _y,\\ X_4&= \partial _\sigma . \end{aligned}$$

Let g be the metric such that at any point $ \frac{1}{\sqrt{\sigma }}X_1,\ \frac{1}{\sqrt{\sigma }}X_2,\ \sqrt{\sigma }X_3,\ \sqrt{\sigma }X_4$ is an orthonormal system. That is at every point the metric g is represented by the matrix

$$\begin{aligned} {\tilde{g}}=\left( \begin{array}{cccc} \sigma &{}0 &{}0 &{}0\\ 0 &{}\sigma &{}0&{}0\\ 0 &{}0 &{}\frac{1}{\sigma } &{}0\\ 0&{} 0&{} 0&{} \frac{1}{\sigma }\\ \end{array}\right) \end{aligned}$$

For $\sigma $ that tends to 0, g tends to the sub-Riemannian structure treated in [8] (module rescaling), for $\sigma \rightarrow + \infty $ it approaches a hyperbolic metric.

In this setting, the measure $\mu \cdot dk=\langle I,\psi ^k\rangle \cdot dk$ on M, is $\langle I,\psi ^k\rangle $ times the measure induced by the metric g.

As proved in “Appendix A”, the integral

$$\begin{aligned} \int _{M}\psi ^k({\tilde{x}},{\tilde{y}})\mathrm{d}k \end{aligned}$$

has finite value 0, independently of the pair $({\tilde{x}},{\tilde{y}})$.

Remark 2.3

The mass of the output function $\mu $ on M is proportional to the mass of I and therefore it is null. Indeed,

$$\begin{aligned} \int _{M}\mu (k)\mathrm{d}k&=\int _{M}\mathrm{d}k\int _{{\mathbb {R}}^2}I({\tilde{x}},{\tilde{y}})\psi ^k({\tilde{x}},{\tilde{y}})\mathrm{d}{\tilde{x}}\mathrm{d}{\tilde{y}}\\&=\int _{{\mathbb {R}}^2}I({\tilde{x}},{\tilde{y}})\mathrm{d}{\tilde{x}}\mathrm{d}{\tilde{y}}\int _{M}\psi ^k({\tilde{x}},{\tilde{y}})\mathrm{d}k=0 \end{aligned}$$

by the Fubini’s theorem.

3 The Optimal Transport Problem

We recall the classical Kantorovich’s formulation of optimal transport. Our main reference is the Ambrosio-Gigli guide [3]. For any Polish space X (i.e. a complete and separable metric space) we denote by ${\mathcal {P}}(X)$ the set of Borel probability measures on X and by $\mathcal B(X)$ the set of Borel sets on X. Consider two Polish spaces X, Y, if $\mu \in {\mathcal {P}}(X)$ and $T:X\rightarrow Y$ is a Borel map, then we denote by $T_\#\mu \in {\mathcal {P}}(Y)$ the pushforward of $\mu $ through T, defined by

$$\begin{aligned} T_\#\mu (E)=\mu (T^{-1}E)\ \ \ \forall E\in \mathcal B(Y). \end{aligned}$$

Consider the natural product $X\times Y$ and its associated projections $\pi ^X,\pi ^Y$. Let $c:X\rightarrow Y$ be a Borel map called cost function, and consider two measures $\mu \in \mathcal {P}(X)$ and $\nu \in {\mathcal {P}}(Y)$.

Definition 3.1

An admissible transport plan between $\mu $ and $\nu $ is a measure $\gamma \in \mathcal {P}(X\times Y)$ such that $\pi ^X_\#\gamma =\mu $ and $\pi ^Y_\#\gamma =\nu $, or equivalently

$$\begin{aligned} \gamma (A\times Y)&=\mu (A)\ \ \forall A\in \mathcal {B}(X)\\ \gamma (X\times B)&=\nu (B)\ \ \forall B\in \mathcal {B}(Y). \end{aligned}$$

We denote the set of admissible transport plans between $\mu $ and $\nu $ by ${{\,\mathrm{Adm}\,}}(\mu ,\nu )$.

We want to minimize the integral

$$\begin{aligned} \int _{X\times Y}c(x,y)\mathrm{d}\gamma (x,y) \end{aligned}$$

for all the admissible transport plans between $\mu $ and $\nu $. We say that $\gamma $ is induced by a transport map if there exists a Borel map $T:X\rightarrow Y$ such that $\gamma =({{\,\mathrm{id}\,}}\times T)_\#\mu $, in that case

$$\begin{aligned} \int _{X\times Y}c(x,y)\mathrm{d}\gamma =\int _Xc(x,T(x))\mathrm{d}\mu . \end{aligned}$$

An optimal transport plan is a transport plan $\gamma $ that realizes the infimum above. It is known that such a minimizer exists under very general conditions. We denote the set of optimal plans by ${{\,\mathrm{Opt}\,}}(\mu ,\nu )$.

Proposition 3.2

(see [31, Theorem 4.1] Consider $\mu \in {\mathcal {P}}(X)$ and $\nu \in {\mathcal {P}}(Y)$. If the cost function c is lower semicontinuous and bounded from below, then there exists an optimal plan $\gamma $ for the functional

$$\begin{aligned} \gamma \mapsto \int _{X\times Y}c(x,y)\mathrm{d}\gamma (x,y), \end{aligned}$$

among all $\gamma \in {{\,\mathrm{Adm}\,}}(\mu ,\nu )$.

We are interested in the cases where an optimal plan is induced by a transport map T. We state [3, Lemma 1.20], referring to Ambrosio-Gigli paper for a proof.

Lemma 3.3

Consider $\gamma \in {{\,\mathrm{Adm}\,}}(\mu ,\nu )$. Then $\gamma $ is induced by a map if and only if $\gamma $ is concentrated in a measurable set $\Gamma \subset X\times Y$ such that for $\mu $-a.e. x there exists only one $y=T(x)$ in $\Gamma \cap (\{x\}\times Y)$. In this case, T(x) induces $\gamma $.

In order to introduce the notion of c-concavity for some cost function c, and in order to show the existence and uniqueness of an optimal transport plan $\gamma $ induced by a transport map T, we give some definitions following [3, Chap.1].

Definition 3.4

(Superdifferential) Consider M a Riemannian manifold and any function $\varphi :M\rightarrow {\mathbb {R}}$, we define its superdifferential at any point $x\in M$,

$$\begin{aligned} \partial ^+\varphi (x):= & {} \left\{ dh(x)\in T_x^*M:\ h\in {\mathcal {C}}^1(M,{\mathbb {R}}),\ \varphi \right. \nonumber \\&\left. -h \text{ attains } \text{ a } \text{ local } \text{ maximum } \text{ at } x\right\} . \end{aligned}$$

When there is no risk of confusion, we will denote by $\partial ^+\varphi $ the associated subspace of the total space $T^*M$. The subdifferential $\partial ^-\varphi $ is defined analogously as the set of differentials dh(p) where $\varphi -h$ attains a local minimum of $\varphi -h$.

Remark 3.5

Equivalently, $\partial ^+\varphi (x)$ is the set of vectors $v\in T_xM$ such that

$$\begin{aligned} \varphi (z)-\varphi (x)\le \langle v,\exp ^{-1}_x(z)\rangle +o(d(x,z)). \end{aligned}$$

The same for $\partial ^-\varphi (x)$ with inversed inequality. With this definition, $\partial ^+\varphi $ and $\partial ^-\varphi $ are subspaces of TM.

It is well known that if $\varphi $ is differentiable at $x\in M$, its superdifferential and subdifferential at x coincide and contain only the $\varphi $ gradient,

$$\begin{aligned} \partial ^+\varphi (x)=\partial ^-\varphi (x)=\{\nabla \varphi (x)\}. \end{aligned}$$

Consider two Polish spaces X, Y, and a cost function $c:X\times Y\rightarrow {\mathbb {R}}$.

Definition 3.6

(c-transforms) Consider a function $\varphi :X\rightarrow {\mathbb {R}}\cup \{\pm \infty \}$, its $c_+$-transform $\varphi ^{c_+}:Y\rightarrow {\mathbb {R}}\cup \{\pm \infty \}$ is defined as

$$\begin{aligned} \varphi ^{c_+}(y):=\inf _{x\in X}c(x,y)-\varphi (x). \end{aligned}$$

Analogously for any $\psi :Y\rightarrow {\mathbb {R}}\cup \{\pm \infty \}$, we can define its $c_+$-transform $\psi ^{c_+}:X\rightarrow {\mathbb {R}}\cup \{\pm \infty \}$.

The $c_-$-transform of $\varphi $ is $\varphi ^{c_-}:Y\rightarrow {\mathbb {R}}\cup \{\pm \infty \}$ defined as

$$\begin{aligned} \varphi ^{c_-}(y):=\sup _{x\in X}-c(x,y)-\varphi (x). \end{aligned}$$

Analogously for the $c_-$-transform $\psi ^{c_-}$ of $\psi $.

Definition 3.7

(c-concavity) We say that a function $\varphi :X\rightarrow {\mathbb {R}}\cup \{-\infty \}$ is c-concave if there exists $\psi :Y\rightarrow {\mathbb {R}}\cup \{-\infty \}$ such that $\varphi =\psi ^{c_+}$. Analogously we have a notion of c-convexity.

Definition 3.8

(Semiconcavity) A function $f:U\rightarrow {\mathbb {R}}$ whose domain is a convex subset U of a Riemannian manifold M, is semiconcave with constant K if for every geodesic $\gamma :[0,1]\rightarrow U$ and $t\in [0,1]$, we have

$$\begin{aligned}&t\cdot f(\gamma _0)+(1-t)\cdot f(\gamma _1)\\&\le f(\gamma _t)+\frac{1}{2}\cdot t(1-t) K\cdot d^2(\gamma _0,\gamma _1). \end{aligned}$$

With the notation of [31, Chap.10], the definition above describes a semiconcave function with modulus $\omega (t)=K\frac{t^2}{2}$.

Remark 3.9

Observe that by [31, Equation (10.14)], if f is semiconcave, superdifferentiable at x and $q\in \partial ^+f(x)$, then

$$\begin{aligned} f(\exp _x w)-f(x)\le \langle q, w\rangle -\frac{1}{2}K\left\Vert w\right\Vert ^2. \end{aligned}$$

Definition 3.10

(c-superdifferential) Consider $\varphi :X\rightarrow {\mathbb {R}}\cup \{-\infty \}$ a c-concave function, then its c-superdifferential $\partial ^{c_+}\varphi \subset X\times Y$ is defined as

$$\begin{aligned} \partial ^{c_+}\varphi :=\left\{ (x,y):\ \varphi (x)+\varphi ^{c_+}(y)=c(x,y)\right\} . \end{aligned}$$

We denote by $\partial ^{c_+}\varphi (x)$ the set of $y\in Y$ such that $(x,y)\in \partial ^{c_+}\varphi $. Analogously we can define the c-subdifferential $\partial ^{c_-}\varphi \subset X\times Y$.

Consider two probability measures $\mu \in {\mathcal {P}}(X)$ and $\nu \in {\mathcal {P}}(Y)$. In the following, we will consider a cost function $c:X\times Y\rightarrow {\mathbb {R}}$ such that there exists two functions $a\in L^1(\mu )$, $b\in L^1(\nu )$, respecting the inequality below.

$$\begin{aligned} c(x,y)\le a(x)+b(y). \end{aligned}$$

(3.1)

Theorem 3.11

[3, Theorem 1.13]) Consider $\mu \in {\mathcal {P}}(X)$, $\nu \in {\mathcal {P}}(Y)$ and $c:X\times Y\rightarrow {\mathbb {R}}$ a continuous and bounded from below cost function such that there exist two functions $a\in L^1(\mu )$ and $b\in L^1(\nu )$ verifying condition (3.1). Then there exists a c-concave function $\varphi :X\rightarrow {\mathbb {R}}$ such that $\varphi \in L^1(\mu )$, $\varphi ^{c_+}\in L^1(\nu )$ and for any optimal plan $\gamma \in {{\,\mathrm{Opt}\,}}(\mu ,\nu )$,

$$\begin{aligned} {{\,\mathrm{supp}\,}}(\gamma )\subset \partial ^{c_+}\varphi . \end{aligned}$$

Consider the manifold $X={\mathbb {R}}^2\times S^1\times {\mathbb {R}}$ with the Lebesgue metric. Consider a function $\varphi :X\rightarrow {\mathbb {R}}$ and the square distance $c(x,y)=d^2(x,y)/2$ as cost function, then the following lemma and proposition state the link between the superdifferential $\partial ^+\varphi $ and the c-superdifferential $\partial ^{c_+}\varphi $.

Lemma 3.12

For any point $y\in X$, the function $d^2(-,y)/2$ is uniformly semiconcave on X.

This is proven following the line of reasoning of [31, Third Appendix], because X is flat and therefore its sectional curvature is everywhere 0.

Proposition 3.13

Consider a c-concave function $\varphi :X\rightarrow {\mathbb {R}}$, then it must be semiconcave. Furthermore, for any $x\in X$, $\exp _x^{-1}(\partial ^{c_+}\varphi (x))\subset -\partial ^+\varphi (x)$.

Proof

This is a slight generalization of [3, Proposition 1.30], and we develop the same arguments. As stated by Lemma 3.12, for any $y\in X$ the distance function $d^2(-,y)/2$ is semiconcave, therefore by Remark 3.9 this implies

$$\begin{aligned} \frac{\mathrm{d}^2(z,y)}{2}-\frac{\mathrm{d}^2(x,y)}{2}\le -\langle v,\exp _x^{-1}(z)\rangle +o(d(x,z)), \end{aligned}$$

because if $v\in \exp _x^{-1}(y)$, then $-v$ is in the superdifferential of $d^2(-,y)/2$ at x.

If we take $d^2(x,y)/2$ as the cost function c(x, y) and consider $y\in \partial ^{c_+}\varphi (x)$, therefore by definition $\varphi (z)-c(z,y)\le \varphi (x)-c(x,y)$ for any $z\in X$. As a consequence

$$\begin{aligned} \varphi (z)-\varphi (x)\le & {} \frac{\mathrm{d}^2(z,y)}{2}-\frac{\mathrm{d}^2(x,y)}{2}\nonumber \\\le & {} \langle -v,\exp _x^{-1}(z)\rangle +o(d(x,z)), \end{aligned}$$

that means $-v\in \partial ^+\varphi (x)$. $\square $

Definition 3.14

(Regular measure) We say that a measure $\mu \in {\mathcal {P}}(X)$ is regular if it vanishes on the set of points of non differentiability of any semiconcave function $\varphi :X\rightarrow {\mathbb {R}}$.

For any Polish space (X, d), we introduce the space of probability measures on X with finite 2-momentum with respect to d,

$$\begin{aligned}&{\mathcal {P}}_2(X):\\&=\left\{ \mu \in {\mathcal {P}}(X):\ \right. \nonumber \\&\left. \int \mathrm{d}^2(x,x_0)\mathrm{d}\mu <\infty \ \text{ for } \text{ some, } \text{ and } \text{ thus } \text{ any, } x_0\in X\right\} . \end{aligned}$$

We regard ${\mathcal {P}}_2(X)$ as a metric space with respect to the sup norm.

Theorem 3.15

Consider the Riemannian manifold X and a probability measure $\mu \in {\mathcal {P}}_2(X)$. If $\mu $ is regular, then for every $\nu \in {\mathcal {P}}_2(X)$ there exists only one transport plan from $\mu $ to $\nu $ and it is induced by a map T. If this is the case, the map T can be written as $x\mapsto \exp _x(-\nabla \varphi (x))$ for some c-concave function $\varphi :X\rightarrow {\mathbb {R}}$.

Observe that this is a generalization of [3, Theorem 1.33] to the case of the non-compact Riemannian manifold X. For another reference see also [14].

Proof

In order to apply Theorem 3.11, we are going to verify that condition (3.1) is respected. In particular, we want to show that taking $a(x)=d^2(x,x_0)$ and $b(y)=d^2(y,x_0)$ for any $x_0\in M$, then $a\in L^1(\mu )$, $b\in L^1(\nu )$ and $c(x,y)=d^2(x,y)/2\le d^2(x,x_0)+d^2(y,x_0)$. The inequality is proved by

$$\begin{aligned}&d^2(x,x_0)+d^2(y,x_0)\\&\ge \frac{\mathrm{d}^2(x,x_0)+d^2(y,x_0)}{2}+|d(x,x_0)d(y,x_0)|\\&=\frac{1}{2}(d(x,x_0)+d(y,x_0))^2\ge \frac{\mathrm{d}^2(x,y)}{2}. \end{aligned}$$

To have $a\in L^1(\mu )$ (and therefore $b\in L^1(\nu )$) it suffices to have $\int _M\mathrm{d}^2(x,x_0)\mu (x)< \infty $ but this is exactly the definition of $\mu \in {\mathcal {P}}_2(X)$.

Thus as a consequence of Theorem 3.11, there exists a c-concave function $\varphi $ such that any optimal plan $\gamma $ is concentrated on $\partial ^{c_+}\varphi $. By Proposition 3.13, $\varphi $ is semiconcave and therefore differentiable $\mu $-a.e. by $\mu $-regularity. If $\varphi $ is differentiable at x, then $\partial ^+\varphi (x)$ is the singleton $\{\nabla \varphi (x)\}$, therefore $\partial ^{c_+}\varphi (x)$ is empty or equals $\exp _x(-\nabla \varphi (x))$. We define $\mu $-a.e. the function $T(x)=\exp _x(-\nabla \varphi (x))$. As ${{\,\mathrm{supp}\,}}(\gamma )\subset \partial ^{c_+}\varphi $, we must have that T induces $\gamma $, concluding the proof. $\square $

4 Geodesics in ${\mathcal {P}_2(X)}$

The introduction of the Wasserstein distance $W_2$ on $\mathcal P_2$ allows the definition of geodesics in this space. We show a theorem of existence and uniqueness for such a geodesic relying a pair of measures.

Definition 4.1

Consider a metric space (X, d). A curve $\gamma :[0,1]\rightarrow X$ is a constant speed geodesic if

$$\begin{aligned} d(\gamma _s,\gamma _t)=|t-s|d(\gamma _0,\gamma _1)\ \ \forall t,s\in [0,1]. \end{aligned}$$

We recall that (X, d) is called a geodesic space if for every $x,y\in X$ there exists a constant speed geodesic connecting them. We consider the metric space ${{\,\mathrm{Geod}\,}}(X)$ of constant speed geodesics endowed with the sup norm.

On ${{\,\mathrm{Geod}\,}}(X)$ we introduce for any $t\in [0,1]$ the map $e_t:{{\,\mathrm{Geod}\,}}(X)\rightarrow X$ such that

$$\begin{aligned} e_t:\gamma \mapsto \gamma _t. \end{aligned}$$

Furthermore we define the Wassertein distance associated to d on ${\mathcal {P}}_2(X)$.

Definition 4.2

(Wasserstein distance) If $\mu ,\nu \in {\mathcal {P}}_2(X)$, then

$$\begin{aligned} W_2^2(\mu ,\nu ):=\inf _{\gamma \in {{\,\mathrm{Adm}\,}}(\mu ,\nu )}\int \mathrm{d}^2(x,y)\mathrm{d}\gamma . \end{aligned}$$

The following theorem is proved in [3]. For any two probability measure $\mu _0,\mu _1$ on a Polish space, this gives a constant speed geodesic relying them.

Theorem 4.3

If (X, d) is Polish and geodesic, then $({\mathcal {P}}_2(X), W_2)$ is geodesic too. Furthermore, consider $\mu _0,\mu _1\in {\mathcal {P}}_2(X)$ and a path $t\mapsto \mu _t\in {\mathcal {P}}_2(X)$ from $\mu _0$ to $\mu _1$, then $\mu _t$ is a constant speed geodesic if and only if there exists $\varvec{\mu }\in {\mathcal {P}}_2({{\,\mathrm{Geod}\,}}(X))$ such that $(e_0,e_1)_\#\varvec{\mu }\in {{\,\mathrm{Opt}\,}}(\mu _0,\mu _1)$ and $\mu _t=(e_t)_\#\varvec{\mu }$.

We consider on X the Lebesgue measure dk. The measures $\mu $ that we are going to treat are always induced by density a.e.-continuous functions.

Remark 4.4

We know that the geodesic relying two points on a complete Riemannian manifold X is almost everywhere unique (see for example [31]). This means that the set of pairs $(x,y)\in X^2$ such that the geodesic between them is unique has full measure. The same is true for any measure $\mu $ on X if it is induced by a density a.e.-continuous function.

Therefore the maps $e_t$ naturally induces almost everywhere a map that we denote in the same way:

$$\begin{aligned} e_t:X^2\rightarrow X, \end{aligned}$$

sending the pair (x, y) to the point $\gamma _t$ where $\gamma $ is the geodesic such that $\gamma _0=x$ and $\gamma _1=y$.

Consider two probability measures $\mu _0,\mu _1$ over X respecting the hypothesis of Theorem 3.15. Let T be the transport map between them, then as a consequence of Theorem 4.3 the (unique, coming from the uniqueness of T) geodesic between $\mu _0$ and $\mu _1$ can be written as

$$\begin{aligned} \mu _t=(e_t\circ ({{\,\mathrm{id}\,}},T))_\#\mu _0. \end{aligned}$$

In what follows, we will use the notation $e_t^{(T)}:=e_t\circ ({{\,\mathrm{id}\,}},T)$, and therefore we will have $\mu _t=(e_t^{(T)})_\#\mu _0$.

5 Reconstructing the Visual Input via the Gabor Frame

We are going to introduce the notion of continuous frame, this allows the reconstruction of a function I on the retinal plane, from the datum of an output function $\mu $.

Definition 5.1

Consider a Hilbert space $\mathcal {H}$ and a measure space M with a positive measure $\rho $. A continuous frame is a family of vectors $\{\psi ^k\}_{k\in M}$ such that $k\mapsto \langle f, \psi ^k\rangle $ is a measurable function on M for any $f\in \mathcal {H}$, and there exists $A,B>0$ such that

$$\begin{aligned} A\cdot \left\Vert f\right\Vert ^2\le \int _X|\langle f,\psi ^k\rangle |^2\mathrm{d}\rho (k)\le B\cdot \left\Vert f\right\Vert ^2. \end{aligned}$$

Consider $f,g\in \mathcal {H}$ and the mapping

$$\begin{aligned} h_f:g\mapsto \int _M\langle f,\psi ^k\rangle \langle \psi ^k,g\rangle \mathrm{d}\rho (k). \end{aligned}$$

This map is conjugated linear and moreover it is bounded. Indeed,

$$\begin{aligned} |h_f(g)|^2\le & {} \int _M|\langle f,\psi ^k\rangle |^2\mathrm{d}\rho (k)\cdot \int _X|\langle \psi ^k,g\rangle |^2\mathrm{d}\rho (k)\nonumber \\\le & {} B^2\left\Vert f\right\Vert ^2\left\Vert g\right\Vert ^2. \end{aligned}$$

(5.1)

By the Riesz’ representation theorem, there exists a unique element ${\underline{h}}\in \mathcal {H}$ which verifies $h_f=\langle {\underline{h}},-\rangle $. We denote this element by $\int _X\langle f,\psi ^k\rangle \psi ^k \mathrm{d}\rho (k)$.

We denote by $S:\mathcal {H}\rightarrow \mathcal {H}$ the operator

$$\begin{aligned} Sf:=\int _M\langle f,\psi ^k\rangle \psi ^k\mathrm{d}\rho (k),\ \ \forall f\in \mathcal {H}. \end{aligned}$$

Lemma 5.2

the operator S is linear and

1.
it is bounded and positive, with $\left\Vert S\right\Vert \le B$;
2.
it is invertible;
3.
the family $\{S^{-1}\psi ^k\}_{k\in M}$ is a continuous frame;
4.
for any $f\in \mathcal {H}$,
$$\begin{aligned} f=\int _M\langle f,\psi ^k\rangle S^{-1}\psi ^k\mathrm{d}\rho (k)=\int _M\langle f,S^{-1}\psi ^k\rangle \psi ^k\mathrm{d}\rho (k), \end{aligned}$$
where the equality is intended in the weak sense.

For a proof of this see [7, §5.8].

From now on, we suppose $M={\mathbb {R}}^2\times S^1\times {\mathbb {R}}^+$ with the measure $\rho $ such that $d\rho (k)=\frac{\mathrm{d}k}{\sigma ^2}$. In our case, the vectors $\psi ^k$ are the Gabor filters indexed by the points $k=(x,y,\theta ,\sigma )\in M$. Consider the usual scalar product $\langle , \rangle $ in $L^2({\mathbb {R}}^2)$, then in “Appendix B” we prove that there exists a constant $C_\psi \in {\mathbb {R}}^+$ such that for any pair of inputs $I,I'$ in $L^2({\mathbb {R}}^2)$

$$\begin{aligned} \int _M\langle I,\psi ^k\rangle \langle \psi ^k,I'\rangle \frac{\mathrm{d}k}{\sigma ^2} =C_\psi \cdot \langle I,I'\rangle . \end{aligned}$$

(5.2)

As a corollary,

$$\begin{aligned} SI=C_\psi \cdot I, \end{aligned}$$

(5.3)

where the equality as to be intended in the weak sense, that is $h_I=\langle C_\psi I,-\rangle _{L^2}$. In “Appendix C”, we prove that the equality (5.3) is also true in a much stronger sense.

We observe that $\langle I,\psi _{x,y,\theta ,\sigma }\rangle $ is exactly the output function $\mu $ associated to I, see Definition 2.1. Therefore by Lemma 5.2 and equality (5.3),

$$\begin{aligned}&I({\tilde{x}},{\tilde{y}})=\frac{1}{C_\psi }\cdot \nonumber \\&\int _M\mu (x,y,\theta ,\sigma )\cdot \psi ^k({\tilde{x}},{\tilde{y}}) \cdot \frac{1}{\sigma ^2}\mathrm{d}x\mathrm{d}y\mathrm{d}\theta \mathrm{d}\sigma .\end{aligned}$$

(5.4)

In the next section, starting from two input functions $I_0, I_1$ we produce a path $\mu _t$ of output functions. In order to produce a path in the input space from $\mu _t$, we define

$$\begin{aligned} I_t:=\frac{1}{C_\psi }\cdot \int _M\mu _t(k)\cdot \psi ^k \frac{\mathrm{d}k}{\sigma ^2}. \end{aligned}$$

(5.5)

6 Deformation of the Output

In this section, we show the existence of a path relying two output functions $\mu _0,\mu _1$. In particular, for any output we obtain two probability measures from the positive and negative part of the output. Using the results of Sects. 3 and 4, we build the paths relying the associated measure, and conclude from there.

We introduce a new condition on the input image $I:{\mathbb {R}}^2\rightarrow [0,1]$. If $\mu $ is the output associated to I, that is

$$\begin{aligned} \mu (k)= \langle I,\psi ^k\rangle . \end{aligned}$$

We look at $\mu $ as a function defined on the whole $X={\mathbb {R}}^2\times S^1\times {\mathbb {R}}$, but it is supported only on $M={\mathbb {R}}^2\times S^1\times {\mathbb {R}}^+\subset X$. We define the two functions ${\tilde{\mu }}^+:=\max (\mu ,0)$ and ${\tilde{\mu }}^-:=\max (-\mu ,0)$. We impose the condition

$$\begin{aligned} \int _X\mathrm{d}^2(k, 0){\tilde{\mu }}^+(k) \mathrm{d}k< \infty , \end{aligned}$$

(6.1)

and the same for ${\tilde{\mu }}^-$.

Lemma 6.1

If conditions (6.1) holds, then $\int \mathrm{d}^2(k,k_0){\tilde{\mu }}^+(k) \mathrm{d}k$ is finite for any $k_0\in X$.

Proof

We start by observing that $\int _X{\tilde{\mu }}^+(k) \mathrm{d}k<\infty $. Indeed, if we consider the compact $B=\{k|\ d(k,0)\le 1\}$ and the maximum b of ${\tilde{\mu }}^+$ over B, we have

$$\begin{aligned} \int _X{\tilde{\mu }}^+(k)\mathrm{d}k&\le {\tilde{\mu }}^+(B)\cdot b+\int _{k\notin B} {\tilde{\mu }}^+(k)\mathrm{d}k\\&\le {\tilde{\mu }}^+(B)\cdot b+\int _{k\notin B}\mathrm{d}^2(k,0){\tilde{\mu }}^+(k)\mathrm{d}k <\infty . \end{aligned}$$

For any $k_0\in X$, by the triangular inequality we have

$$\begin{aligned}&d^2(k,k_0)\\&\le d^2(k,0)+d^2(k_0,0)+2\cdot |d(k,k_0)\cdot d(k_0,0)|\ \ \forall k\in X. \end{aligned}$$

Therefore

$$\begin{aligned} \int _X \mathrm{d}^2(k,k_0){\tilde{\mu }}^+(k)\mathrm{d}k&\le \int _X\mathrm{d}^2(k,0){\tilde{\mu }}^+(k)\mathrm{d}k\nonumber \\&\quad +\mathrm{d}^2(k_0,0)\cdot \int _X{\tilde{\mu }}^+(k)\mathrm{d}k\\&\quad +2\cdot \int _X \mathrm{d}(k,0)\cdot \mathrm{d}(k_0,0){\tilde{\mu }}^+(k)\mathrm{d}k. \end{aligned}$$

The first two terms are clearly finite. Concerning the last one,

$$\begin{aligned}&\int _X \mathrm{d}(k,0)\cdot \mathrm{d}(k_0,0){\tilde{\mu }}^+(k)\mathrm{d}k\\&\le d(k_0,0)\cdot {\tilde{\mu }}^+(B)\nonumber \\&\quad +\int _{k\notin B} \mathrm{d}(k,0)\cdot \mathrm{d}(k_0,0){\tilde{\mu }}^+(k)\mathrm{d}k\\&\le d(k_0,0)\cdot {\tilde{\mu }}^+(B)+d(k_0,0)\cdot \\&\int _{k\notin B} \mathrm{d}(k,0)^2{\tilde{\mu }}^+(k)\mathrm{d}k<\infty , \end{aligned}$$

and this concludes the proof. $\square $

Remark 6.2

For $I:{\mathbb {R}}^2\rightarrow [0,1]$ we defined above ${\tilde{\mu }}^+$ and ${\tilde{\mu }}^-$. Consider the coefficient

$$\begin{aligned} m:=\int _M{\tilde{\mu }}^+(k)\mathrm{d}k=-\int _M{\tilde{\mu }}^-(k)\mathrm{d}k, \end{aligned}$$

which is well defined (see Remark 2.3) and finite as a consequence of condition (6.1) (see the proof of Lemma 6.1). We renormalize, $\mu ^+:={\tilde{\mu }}^+/m$ and $\mu ^-:={\tilde{\mu }}^-/m$.

Therefore for any function $\mu :M\rightarrow {\mathbb {R}}$ there exists two probability densities $\mu ^+$ and $\mu ^-$ such that

$$\begin{aligned} \mu =m\cdot (\mu ^+- \mu ^-), \end{aligned}$$

where m is a positive coefficient and $\mu ^+\cdot \mu ^-\equiv 0$. This means that $\mu ^+$ and $\mu ^-$ are the positive and negative part of $\mu $, renormalized in order to become probability densities.

Given two inputs $I_0,I_1$, we define respectively $\mu _i^+,\mu _i^-$ and $m_i$ for $i=0,1$. Using the equality (5.4), we obtain

$$\begin{aligned} I_0&=C_\psi ^{-1}\cdot m_0 \cdot \left( \int _M\mu _0^+(k)\psi ^k\frac{\mathrm{d}k}{\sigma ^2}-\int _M\mu _0^-(k)\psi ^k\frac{\mathrm{d}k}{\sigma ^2}\right) \\ I_1&=C_\psi ^{-1}\cdot m_1 \cdot \left( \int _M\mu _1^+(k)\psi ^k\frac{\mathrm{d}k}{\sigma ^2}-\int _M\mu _1^-(k)\psi ^k\frac{\mathrm{d}k}{\sigma ^2}\right) .\\ \end{aligned}$$

We consider the complete Riemannian variety $X={\mathbb {R}}^2\times S^1\times {\mathbb {R}}$ with the Lebesgue metric dk. In particular, we consider $\mu _i^+$ and $\mu _i^-$, for $i=0,1$, as measures on X even if they are defined over $M\subset X$. The function $\mu _i^\pm $ is identified with $\mu _i^\pm \cdot dk$ if $\sigma >0$ and with the null measure elsewhere. As a consequence of condition (6.1), $\mu _i^+,\mu _i^-\in {\mathcal {P}}_2(X)$ for $i=0,1$.

We consider the pairs $\mu _0^+,\mu _1^+$ and $\mu _0^-,\mu _1^-$, by Theorem 4.3 and Remark 4.4, there exists two constant speed geodesics $\mu _t^+$ and $\mu _t^-$.

Remark 6.3

In particular, by Theorem 3.15 there exists a transport map $T^+$ such that $\mu _1^+=T^+_\#\mu _0^+$ and the same for the negative part with a transport map $T^-$. Therefore, $\mu _t^+=\left( e_t^{(T^+)}\right) _\#\mu _0^+$ and $\mu _t^-=\left( e_t^{(T^-)}\right) _\#\mu _0^-$.

Remark 6.4

By definition of $e_t$ and of the transport map, the measures $\mu _t^\pm $ are null outside M for any $t\in [0,1]$, therefore we can always look at them as measures in ${\mathcal {P}}_2(M)$.

Furthermore, we point out that by construction $\mu _t^+$ and $\mu _t^-$ are absolutely continuous with respect to the Lebesgue measure dk. In the following, we will use the notation $\mu _t^+$ and $\mu _t^-$ indistinctly for the measures and for the density functions (defined over M) when there is no risk of confusion.

We consider a linear variation of the mass m, this means that we define a varying coefficient

$$\begin{aligned} m_t:=m_0(1-t)+m_1t\ \ \forall t\in [0,1] \end{aligned}$$

(6.2)

We define the path of output functions using these coefficients,

$$\begin{aligned} \mu _t:=m_t\cdot \left( \left( e_t^{(T^+)}\right) _\#\mu _0^+-\left( e_t^{(T^-)}\right) _\#\mu _0^-\right) . \end{aligned}$$

(6.3)

Finally, from Eq. (5.5) we obtain a path of input functions from $I_0$ to $I_1$.

7 Constraining the Output

In this section, we introduce a useful tool to describe the geodesic $\mu _t$, the so-called weak Riemannian structure of $({\mathcal {P}}_2(X), W_2)$, the space of probabilities endowed with the Wassertein distance.

If $\mu _t$ is an absolutely continuous curve in ${\mathcal {P}}_2(X)$ (with respect to the Wassertein distance), consider a time dependent vector field $v_t$ on TX such that the following continuity equation is verified in the sense of distributions,

$$\begin{aligned} \frac{\mathrm{d}}{\mathrm{d}t}\mu _t+\nabla \cdot (v_t\mu _t)=0. \end{aligned}$$

(7.1)

For the proof of the following theorem, we refer again to [3].

Theorem 7.1

(see [3, Theorem 2.29]) If X is a smooth complete Riemannian manifold without boundary, then

1.
for every absolutely continuous curve $\mu _t\in {\mathcal {P}}_2(X)$ there exists a Borel family of vector fields $v_t$ such that $\left\Vert v_t\right\Vert _{L^2(\mu _t)}\le |{\dot{\mu }}_t|$ for a.e. t and the continuity Eq. (7.1) is satisfied (in the sense of distributions);
2.
if $(\mu _t,v_t)$ satisfies (7.1) and $\int _0^1\left\Vert v_t\right\Vert _{L^2(\mu _t)}\mathrm{d}t$ is finite, then $\mu _t$ is an absolutely continuous curve (up to a negligible set of points) and $|{\dot{\mu }}_t|\le \left\Vert v_t\right\Vert _{L^2(\mu _t)}$ for a.e. $t\in [0,1]$.

Remark 7.2

We recall the Benamou–Brenier formula proved for example at [3, Proposition 2.30] and stating that the minimization problem solved by a geodesic relying $\mu _0,\mu _1\in {\mathcal {P}}_2(X)$ can be reformulated in terms of the vector field $v_t$. In particular, we have

$$\begin{aligned} W_2(\mu _0,\mu _1)=\inf \int _0^1\left\Vert v_t\right\Vert _{L^2(\mu _t)}\mathrm{d}t, \end{aligned}$$

where the infimum is taken among all weakly continuous distributional solutions of the continuity equation for $(\mu _t,v_t)$.

As a direct consequence of Theorem 7.1, for every absolutely continuous curve $\mu _t$ in ${\mathcal {P}}_2(X)$, there exists a family of vector fields $(v_t)$ verifying the continuity equation and such that $\left\Vert v_t\right\Vert _{L^2(\mu _t)}=|{\dot{\mu }}_t|$ for a.e. t. This family is not unique in general, but it is unique if we define as follows the tangent space to ${\mathcal {P}}_2(X)$ where the vector fields must live in.

Definition 7.3

If $\mu \in {\mathcal {P}}_2(X)$, then the tangent space to ${\mathcal {P}}_2(X)$ is defined as

$$\begin{aligned} T_\mu {\mathcal {P}}_2(X)&:={\overline{\left\{ \nabla \varphi :\ \varphi \in {\mathcal {C}}^\infty _c(X)\right\} }}^{L^2(\mu )}\\&=\left\{ v\in L^2(\mu ):\ \int \langle v,w\rangle \mathrm{d}\mu =0,\ \right. \\&\left. \forall w\in L^2(\mu )\ \text{ s.t. }\ \nabla \cdot (w\mu )=0\right\} . \end{aligned}$$

Therefore for any absolutely continuous curve $\mu _t$ in ${\mathcal {P}}_2(X)$, we have an associated vector field $v_t$. In particular, we have it for the geodesics $\mu _t^+$ and $\mu _t^-$ obtained via two inputs $I_0,I_1$ (see Remark 6.3). Observe that both $\mu _t^+$ and $\mu _t^-$ are constant speed geodesics in $({\mathcal {P}}_2(X),W_2)$ therefore they are absolutely continuous curves, and so Theorem 7.1 applies to them.

We denote by $v_t^+$ and $v_t^-$ the vector fields associated respectively to $\mu _t^+$ and $\mu _t^-$. Moreover, we define the normalized image

$$\begin{aligned} J_t:=\frac{I_t}{m_t}, \end{aligned}$$

so that

$$\begin{aligned} J_t=\int _M\frac{\mu _t(k)}{m_t}\psi ^k\frac{\mathrm{d}k}{\sigma ^2}=\int _M(\mu ^+_t-\mu ^-_t)\psi ^k\frac{\mathrm{d}k}{\sigma ^2}. \end{aligned}$$

(7.2)

We are interested in the existence of a family $J_t$ of inputs ${\mathbb {R}}^2\rightarrow [0,1]$ that relies $J_0$ to $J_1$, such that $J_t$ is in ${\mathcal {C}}^1([0,1]; L^2({\mathbb {R}}^2))$ and $\frac{\mu _t(k)}{m_t}=\langle J_t,\psi ^k\rangle $ for any $k\in M$.

In order to find such a path, we observe that if it exists, then

$$\begin{aligned} \frac{\mathrm{d}}{\mathrm{d}t}\frac{\mu _t(k)}{m_t}=\left\langle \frac{\mathrm{d}J_t}{\mathrm{d}t},\psi ^k\right\rangle \ \ \forall k\in M\subset X. \end{aligned}$$

From the continuity equation, we know that

$$\begin{aligned} \frac{\mathrm{d}\mu _t^+}{\mathrm{d}t}&=-\nabla \cdot (v_t^+\mu ^+_t)\\ \frac{\mathrm{d}\mu _t^-}{\mathrm{d}t}&=-\nabla \cdot (v_t^-\mu ^-_t). \end{aligned}$$

We define $v_t:=v_t^+-v_t^-$, and therefore

$$\begin{aligned} \frac{\mathrm{d}}{\mathrm{d}t}\frac{\mu _t}{m_t}=-\nabla \cdot \left( v_t\frac{\mu _t}{m_t}\right) . \end{aligned}$$

Indeed, by Theorem 7.1, $v_t^+\in L^2(\mu _t^+)$ and $v_t^-\in L^2(\mu _t^-)$. Therefore it is possible to extend both vector fields to the whole M in such a way that $v_t^+\cdot \mu _t^-\equiv 0$ and $v_t^-\cdot \mu _t^+\equiv 0$. Then we have,

$$\begin{aligned}&\frac{\mathrm{d}}{\mathrm{d}t}\frac{\mu _t(k)}{m_t}\\&=-\nabla \cdot \left( v_t\langle J_t,\psi ^k\rangle \right) \nonumber \\&=-\left\langle J_t,v_t(k)\cdot \nabla \psi ^k+\psi ^k\cdot (\nabla \cdot v_t(k))\right\rangle \ \ \ \forall k\in M\subset X. \end{aligned}$$

For an opportune vector field $\alpha \in TM$, we have

$$\begin{aligned} \nabla \psi ^k=\psi ^k\cdot \alpha (k). \end{aligned}$$

(7.3)

In particular, if we use the notation $({\tilde{x}}_k,{\tilde{y}}_k)=\sigma ^{-1}R_{-\theta }({\tilde{x}}-x,{\tilde{y}}-y)$, where $k=(x,y,\theta ,\sigma )\in M$, it is straightforward to verify that

$$\begin{aligned} \alpha ^x&=2 \left( \sigma ^{-2}({\tilde{x}}-x)+\frac{\sigma ^{-1}\sin \theta }{\tan (2{\tilde{y}}_k)}\right) \\ \alpha ^y&= 2\left( \sigma ^{-2}({\tilde{y}}-y)-\frac{\sigma ^{-1}\cos \theta }{\tan (2{\tilde{y}}_k)}\right) \\ \alpha ^\theta&= \frac{2\sigma ^{-1}}{\tan (2{\tilde{y}}_k)}\left( \cos \theta ({\tilde{x}}-x)+\sin \theta ({\tilde{y}}-y)\right) \\ \alpha ^\sigma&= 2 \left( \sigma ^{-3}\left\Vert {\tilde{v}}-v\right\Vert ^2\right. \\&\quad \left. +\frac{\sigma ^{-2}(\sin \theta ({\tilde{x}}-x)-\cos \theta ({\tilde{y}}-y))}{\tan (2{\tilde{y}}_k)}- \frac{3\sigma ^{-1}}{4}\right) . \end{aligned}$$

In particular for any k, the vector field $\alpha $ is well defined almost everywhere on ${\mathbb {R}}^2$.

Therefore for a.e. $t\in [0,1]$ and every $k\in M$,

$$\begin{aligned} \left\langle \frac{\mathrm{d}J_t}{\mathrm{d}t},\psi ^k\right\rangle =-\left\langle J_t(\alpha \cdot v_t+\nabla \cdot v_t),\psi ^k\right\rangle . \end{aligned}$$

Thus, if $\alpha \cdot v_t+\nabla \cdot v_t$ is independent of the variable k as a function ${\mathbb {R}}^2\rightarrow {\mathbb {R}}$, as $\left( \psi ^k\right) _k$ is a frame, this implies

$$\begin{aligned} \frac{\mathrm{d}J_t}{\mathrm{d}t}=-J_t(\alpha \cdot v_t+\nabla \cdot v_t). \end{aligned}$$

(7.4)

This last equality is true in the weak sense but also in the (stronger) sense showed in “Appendix C”.

In order to state our last theorem, we introduce two additional conditions. First we impose that the inputs $I_0,I_1$ are null outside a compact subset of ${\mathbb {R}}^2$. This simply says that the images we are treating are limited in space.

We also impose that the vector fields take their values in a Sobolev space.

Theorem 7.4

Consider two inputs $I_0,I_1:{\mathbb {R}}^2\rightarrow [0,1]$ in $L^2({\mathbb {R}}^2)$ null outside a compact subset of ${\mathbb {R}}^2$, their associated output functions $\mu _0,\mu _1$, the absolutely continuous curve $\mu _t$ relying $\mu _0$ to $\mu _1$ defined in (6.3) and the associated (unique) vector field $v_t$. Moreover, take the vector field $\alpha $ defined in (7.3).

We suppose that

$$\begin{aligned} v\in L^1([0,1]; W^{1,\infty }(X,{\mathbb {R}}^4)). \end{aligned}$$

If for any $t\in [0,1]$ the following equality is verified

$$\begin{aligned} \nabla \left( \alpha \cdot v_t+\nabla \cdot v_t\right) =0, \end{aligned}$$

(7.5)

then there exists a path $I_t\in {\mathcal {C}}^1([0,1];L^2({\mathbb {R}}^2))$ relying $I_0$ to $I_1$ such that $\mu _t(k)=\langle I_t,\psi ^k\rangle $ for any $k\in M$.

Proof

We use the notation $u_t:=\alpha \cdot v_t+\nabla \cdot v_t$ and observe that by (7.5), $u_t$ is independent of the point k. As a consequence of this independence, $u_t$ can be defined also where $\alpha $ is singular. We consider the differential equation

$$\begin{aligned} \frac{\mathrm{d}J_t}{\mathrm{d}t}=-u_t\cdot J_t, \end{aligned}$$

and the coefficient $m_t$ as defined in (6.2). Given the initial function $J_0:=I_0/m_0$, the solution to the equation above is

$$\begin{aligned} J_t:=J_0\cdot e^{-h_t}, \end{aligned}$$

where $h_t:=\int _0^tu_s\mathrm{d}s\ \forall t\in [0,1]$ is a primitive of $u_t$, and it exists as a consequence of $v_t\in W^{1,\infty }$. The function $J_0$ is null outside a compact subset $K\subset {\mathbb {R}}^2$, and $h_t$ is continuous therefore limited over K. This implies $J_t\in L^2({\mathbb {R}}^2)$ for any $t\in [0,1]$.

We define $I_t:=m_t\cdot J_t$ and

$$\begin{aligned} \nu _t(k):=m_t\cdot \langle J_t,\psi ^k\rangle =\langle I_t,\psi ^k\rangle . \end{aligned}$$

By construction, for any $t\in [0,1]$ $\nu _t$ satisfies a.e. the continuity Eq. (7.1) with respect to the vector field $v_t$. Moreover, $\nu _0=\frac{\mu _0}{m_0}=\mu ^+_0-\mu ^-_0$. The solution to the continuity equation under this conditions is unique for absolutely continuous measures (see [1]). Therefore $\nu _t=\frac{\mu _t}{m_t}$ a.e. for a.e. $t\in [0,1]$. We observe that this also proves that $m_1J_1=I_1$. $\square $

8 Discrete Model

In order to produce an implementation of a transport model in the space $M={\mathbb {R}}^2\times S^1\times {\mathbb {R}}^+$, we have to work in a discrete setting, and therefore it is not possible to use the frame $\psi ^k$ introduced above, because it is in fact a frame only when the index k varies along all M.

We consider instead a discrete frame that produce a new set of output functions. The frame we consider is the so-called Gabor Wavelet Pyramid (see for instance [20]) that relies again in the Gabor complex mother function

$$\begin{aligned} e^{-{\tilde{x}}^2-{\tilde{y}}^2}\cdot e^{2\pi i\omega {\tilde{y}}}, \end{aligned}$$

but we split it in its real and complex component, defining two sets of filter functions associated to the two following mother functions

$$\begin{aligned} \psi _e({\tilde{x}},{\tilde{y}})&:=e^{-{\tilde{x}}^2-\gamma \cdot {\tilde{y}}^2}\cdot \cos (2\pi \omega {\tilde{y}})\\ \psi _o({\tilde{x}},{\tilde{y}})&:=e^{-{\tilde{x}}^2-\gamma \cdot {\tilde{y}}^2}\cdot \sin (2\pi \omega {\tilde{y}}). \end{aligned}$$

As we have seen, in the case of the continuous frame, the elements of the frame are obtained via the action of the Heisenberg group on the mother function. In the case of wavelets, the acting group is the affine group. We recall that by $\psi _{e,\theta }$ and $\psi _{o,\theta }$ we mean the same functions with the rotation by $\theta $ applied, that is

$$\begin{aligned} \psi _{e,\theta }({\tilde{x}},{\tilde{y}})&=\psi _e\left( R_{-\theta }({\tilde{x}},{\tilde{y}})\right) \\&=\psi _e\left( \cos (\theta ){\tilde{x}}+\sin (\theta ){\tilde{y}},\ -\sin (\theta ){\tilde{x}}\right. \\&\quad \left. +\cos (\theta ){\tilde{y}}\right) , \end{aligned}$$

and analogously for $\psi _o$.

Definition 8.1

Given two real numbers $a_0,b_0$ and a positive integer number d, we consider $\theta _0:=\pi /d$ and $\theta _\ell =\ell \cdot \theta _0$ for any $\ell $ positive integer. Then we can define the wavelets associated to $\psi _e$ and $\psi _o$. For any $n,k,\ell ,j\in {\mathbb {Z}}_{\ge 0}$,

$$\begin{aligned} \psi _e^{n,k,\ell ,j}({\tilde{x}},{\tilde{y}})&=\frac{1}{a_0^j}\cdot \psi _{e,\theta _\ell }\left( \frac{{\tilde{x}}}{a_0^j}-nb_0,\ \frac{{\tilde{y}}}{a_0^j}-kb_0\right) \\ \psi _o^{n,k,\ell ,j}({\tilde{x}},{\tilde{y}})&=\frac{1}{a_0^j}\cdot \psi _{o,\theta _\ell }\left( \frac{{\tilde{x}}}{a_0^j}-nb_0,\ \frac{{\tilde{y}}}{a_0^j}-kb_0\right) \end{aligned}$$

As developed in [10] by Daubechies for one variable wavelets, and generalized to two variable wavelets by Lee in [21], for an opportune choice of $a_0,b_0\in {\mathbb {R}}$, and $n,k,\ell ,j\in {\mathbb {Z}}_{\ge 0}$, the wavelets above form a frame, that is there exist real numbers $A,B>0$ such that for any $f\in L^2({\mathbb {R}})$ we have

$$\begin{aligned}&A\cdot \left\Vert f\right\Vert ^2\le \sum _{n,k,\ell ,j}\left| \langle f,\psi _e^{n,k,\ell ,j}\rangle \right| ^2+\sum _{n,k,\ell ,j}\left| \langle f,\psi _o^{n,k,\ell ,j}\rangle \right| ^2\\&\le B\cdot \left\Vert f\right\Vert ^2, \end{aligned}$$

where the scalar product is the classical

$$\begin{aligned} \langle f,\psi \rangle =\int f({\tilde{x}},{\tilde{y}})\cdot \psi ({\tilde{x}},{\tilde{y}})\mathrm{d}{\tilde{x}}\mathrm{d}{\tilde{y}}. \end{aligned}$$

As a consequence, it is possible to reconstruct the function as

$$\begin{aligned} f= & {} C\cdot \left( \sum _{n,k,\ell ,j}\langle f,\psi _e^{n,k,\ell ,j}\rangle \cdot \psi _e^{n,k,\ell ,j}\right. \nonumber \\&\left. +\sum _{n,k,\ell ,j}\langle f,\psi _o^{n,k,\ell ,j}\rangle \cdot \psi _o^{n,k,\ell ,j}\right) , \end{aligned}$$

(8.1)

where C is a constant and the equality is true in the weak sense. Observe that the splitting of the mother function in its even and odd part allows to consider only ‘half’ of the $S^1$ circle, or equivalently it allows to use an equipartition of the space of directions $S^1/\{\pm 1\}$.

In our setting, we consider in this case four output functions obtained by convoluting any input $I:{\mathbb {R}}^2\rightarrow [0,1]$ with the frame above. Then, we apply the same procedure of Sect. 6, that is we use optimal transport between functions on M to build a path relying the initial output to the final output.

Instead of working over the whole M, we consider a compact subset

$$\begin{aligned} M_c:=[0,D]^2\times S^1\times [\sigma _{\min },\sigma _{\max }], \end{aligned}$$

where $[0,D]^2$ represents the portion of the plane which is registered on the retina, while $[\sigma _{\min }, \sigma _{\max }]$ is the interval where the scale parameter $\sigma $ varies.

Remark 8.2

The idea behind the Gabor Wavelet Pyramid is that we define the output functions indexed by a discrete subset of $M_c$. We use the notation $\llbracket 0,N\rrbracket $ for the subset $\{0,1,\dots ,N\}\subset {\mathbb {Z}}_{\ge 0}$ for any positive integer N. We set $a_0=\sigma _{\min }$ and $j_\star =\log _{a_{0}}(\sigma _{\max })$. For any $j\in \llbracket 1,j_{\star }\rrbracket $, we consider the discrete subset

$$\begin{aligned}&(b_0a_0^j)\cdot \llbracket 0,\left\lfloor \frac{D}{b_0a_0^j}\right\rfloor \rrbracket ^2\times \theta _0\cdot [1,d]\times \left\{ a_0^j\right\} \\&\subset [0,D]^2\times S^1\times \left\{ a_0^j\right\} , \end{aligned}$$

thus defining different strata of a discrete subset of the whole $M_c$.

Remark 8.3

We observe that any wavelet $\psi _e^{n,k,\ell ,j}$ (the same is true for $\psi _o$) is centred at $(nb_0a_0^j,\ kb_0a_0^j)$. Furthermore, given two inputs $I_0,I_1$, we can define the following functions on $M_c$, for $i=0,1$,

$$\begin{aligned} {\tilde{\mu }}_{i,e}^+(nb_0a_0^j,kb_0a_0^j,\theta _\ell ,a_0^j)&=\max \left( \langle I_i,\psi _e^{n,k,\ell ,j}\rangle ,0\right) \\ {\tilde{\mu }}_{i,e}^-(nb_0a_0^j,kb_0a_0^j,\theta _\ell ,a_0^j)&=-\min \left( \langle I_i,\psi _e^{n,k,\ell ,j}\rangle ,0\right) \end{aligned}$$

and analogously for ${\tilde{\mu }}_{i,o}^+$ and ${\tilde{\mu }}_{i,o}^-$. We normalize these functions in order to obtain the probabilities $\mu _{i,e}^{\pm }$ for $i=0,1$ (and analogously for $\mu _{i,o}^{\pm }$).

In order to build the transport path $\mu _t$ for any of the pairs of output functions above, we focus on the distance over $M_c$. In fact, instead of working with the distance treated in the previous sections, another distance turns out to be more efficient in our setting, and at the same time it is a distance equivalent to the previous one.

Definition 8.4

Two distances $d_1,d_2$ over the space $\Omega $ are said to be equivalent if for any compact set $K\subset \Omega $, there exists a constant C such that

$$\begin{aligned} \frac{1}{C}\cdot d_2(p_1,p_2)\le d_1(p_1,p_2)\le C\cdot d_2(p_1,p_2)\ \ \forall p_1,p_2\in \Omega . \end{aligned}$$

They are locally equivalent if for any $p_0\in \Omega $ there exists a neighbourhood U of $p_0$ such that $d_1$ and $d_2$ are equivalent over U.

The main reference for the equivalence result we are going to use is [22]. With our notation (see Remark 2.2), we consider in every point the metric

$$\begin{aligned} {\tilde{g}}(x,y,\theta ,\sigma ):= \left( \begin{array}{cccc} 1 &{}0 &{}0 &{}0\\ 0 &{}1 &{}0&{}0\\ 0 &{}0 &{}\frac{h_1^2}{\sigma ^2} &{}0\\ 0&{} 0&{} 0&{} \frac{h_2^2}{\sigma ^2}\\ \end{array}\right) \end{aligned}$$

and we denote by $Y_1,Y_2,Y_3,Y_4$ the orthonormal basis at every point defined by $Y_1=X_1$ and $Y_2=X_2$, while $Y_3=\frac{\sigma X_3}{h_1}$ and $Y_4=\frac{\sigma X_4}{h_2}$. As we are interested in constant coefficient flows, consider

$$\begin{aligned} Y=c_1Y_1+c_2Y_2+c_3Y_3+c_4Y_4, \end{aligned}$$

with $c_i\in {\mathbb {R}}^+$ for $i=1,\dots ,4$. In order to evaluate the distance $d_c$ between two points $p_0=(x_0,y_0,\theta _0,\sigma _0)$ and $p_1=(x_1,y_1,\theta _1,\sigma _1)$, we consider the constant coefficient vector field Y that induces a curve $p_t:[0,1]\rightarrow M$ relying them. We develop these evaluations in “Appendix D”.

Proposition 8.5

[See, [22, Theorem 2]] The distance induced by the Riemannian metric ${\tilde{g}}$ and the distance $d_c$ are locally equivalent.

In the next section, we are going to show an implementation of our transport model based on the distance $d_c$.

Remark 8.6

From the previous results we know that there exists four unique paths $\mu _{t,e}^{\pm },\ \mu _{t,o}^{\pm }$ relying the associated probabilities. Using the fact that the chosen set of filters is a frame, we reconstruct the intermediary images, by defining

$$\begin{aligned}&I_t:\\&=\sum _{n,k,\ell ,j}m_{e,t}(\mu _{t,e}^+-\mu _{t,e}^-)(nb_0a_0,kb_0a_0,\theta _\ell ,a_0^j)\cdot \psi _e^{n,k,\ell ,j}\\&+\sum _{n,k,\ell ,j}m_{o,t}(\mu _{t,o}^+-\mu _{t,o}^-)(nb_0a_0,kb_0a_0,\theta _\ell ,a_0^j)\cdot \psi _{o}^{n, k,\ell ,m}, \end{aligned}$$

where the coefficients $m_{e,t}$ and $m_{o,t}$ depend on the masses of ${\tilde{\mu }}_{i,e}^\pm $ and ${\tilde{\mu }}_{i,o}^\pm $ for $i=0,1$.

9 Implementation

In order to implement our model, we code a Gabor Wavelet Pyramid as sketched in the previous section, and evaluate the transport maps using a Sinkhorn’s algorithm of the kind treated by Peyré and Cuturi [25]. In discrete setting, the transport plan between two functions $u_1,u_2$, is a matrix P. In particular, if the two functions are represented as vectors of dimension m, the matrix P has dimension $m\times m$ and verifies the property

$$\begin{aligned} P\varvec{1}_m=u_1,\ \ P^T\varvec{1}_m=u_2 \end{aligned}$$

(9.1)

where $\varvec{1}_m$ is the vector of length m whose coordinates are all 1, and therefore (9.1) means that $P\in {{\,\mathrm{Adm}\,}}(u_1,u_2)$.

We recall that the transport plan P between two functions $u_1,u_2$ minimizes the product $\langle P,C\rangle $ where C is the cost matrix associated to our setting, and P respects condition (9.1). In the case of the Sinkhorn’s algorithm, with regularization coefficient $\varepsilon $, the minimized quantity is

$$\begin{aligned} \langle P,C \rangle -\varepsilon H(P), \end{aligned}$$

(9.2)

where H(P) is the entropy of P. The uniqueness of the solution is in fact true for any strictly concave function H (see [12] for a wider analysis on entropy regularizations). A well-known result [25, §4] states that if $L_C(u_1,u_2)$ is the minimum reached by $\langle P,C\rangle $ under condition (9.1), and $L^\varepsilon _C(u_1,u_2)$ is the minimum of (9.2) under the same condition, then

$$\begin{aligned} L^\varepsilon _C(u_1,u_2)\xrightarrow {\varepsilon \rightarrow 0} L_C(u_1,u_2). \end{aligned}$$

We observe that the plan achieving $L_C(u_1,u_2)$ is not always unique in the discrete setting, while the one achieving $L_C^\varepsilon (u_1,u_2)$ is in fact unique as proved for example by [25, Proposition 4.3]. Moreover, if $P^\varepsilon $ is the optimal plan for $L^\varepsilon _C$, then $P^\varepsilon \rightarrow P^\star $ where $P^\star $ is the maximum entropy plan among those achieving the optimum $L_C$.

In order to compute the geodesic between $\mu _{0,e}^+$ and $\mu _{1,e}^+$, we define the optimal plan for this case

$$\begin{aligned} P^+_e:={{\,\mathrm{argmin}\,}}_{P\in {{\,\mathrm{Adm}\,}}(\mu _{0,e}^+,\mu _{1,e}^+)} \left( \langle P,C\rangle -\varepsilon \cdot H(P)\right) . \end{aligned}$$

and consider the meshgrid of Remark 8.2. The pseudocode for the computation is the following.

We do the analogously for $\mu _{t,e}^-$ and $\mu _{t,o}^\pm $. In order to develop the code, we implemented various algorithms developed in [25], in particular we used part of the code developed by the same authors for the Sinkhorn’s algorithm and disposable online [24].

In our simulation, we set $a_0=2, b_0=1$ and as already said $D=32$ or 64. We also set $\ell =8$ which is coherent with previous results on the orientation sampling in primates (see [5, 27]). Furthermore, we set $\sigma _{\min }=1,1244$ and $\sigma _{\max }=\sigma _{\min }\cdot D$. The pseudocode for the complete simulation is the following.

The images we consider are simple shapes of the letters ‘T’ and ‘E’, the second ones rotated of about $\frac{\pi }{4}$ counterclockwise. We show the implementation for this input in order to emphasize the ability of our model to reconstruct rotational displacement. We also add an hammer-type shape, in order to show that the implementation works well also in the case of a multi-scale object, that is shapes with different thickness along the figure are also well preserved (Figs. 1, 2, 3, 4, 5).

Furthermore, we compare this numerical cortical-style implementation with a classical 2-dimensional planar regularized optimal transport implementation, following the general theory treated for example in [25, §3], and evaluating the transport matrix again via the Sinkhorn’s algorithm. In this case, we take the two inputs $I_0,I_1$ and the probabilities $\nu _0,\nu _1\in {\mathcal {P}}(\llbracket 1,D\rrbracket )$ obtained by normalizing the inputs. The cost function is the quadratic cost $c:\llbracket 1,D\rrbracket ^2\rightarrow {\mathbb {R}}^+$ such that $c(i,j)=(i-j)^2$.

We point out that we applied a smoothing threshold in order to reduce the blur due to the entropic regularization in both our implementations. In particular, we passed the pictures through the sigmoid

$$\begin{aligned} \Sigma (z)=\frac{1}{1+e^{-k\cdot (z-z_0)}} \end{aligned}$$

where $y\in [0,1]$ is the pixel intensity and we choose $k=30$ and $z_0=0.65$.

In these reconstructed images obtained with the cortical model, we observe the conservation of the image structure along the rotational movement, while in the standard 2-d optimal transport the basic shape is lost. Therefore, lifting the input via Gabor filters and moving the output function through optimal transport tools seems to allow an effective image deformation preserving the fundamental aspects of rigid rotational motion.

If we consider a rigid translation, our procedure works well in the case of simple shapes that are moved along their principal direction, as it is the case for the ‘I’ shape in Fig. 6.

10 Conclusions and Future Developments

We formulated the general theory of a lifting of retinal inputs in a 4-dimensional cortical space, and of the time completion between two cortical outputs $\mu _0,\mu _1$ (corresponding to inputs $I_0,I_1$ in the retinal plane). We did the lifting through Gabor filters, and we considered the frame conditions that allow to project cortical measures back to the retinal space, thus obtaining from the completion paths $\mu _t^{\pm }$ on the cortical space, a retinal path $I_t$ between the original inputs.

We obtained the cortical paths via methods of optimal transport, where the cost function is the squared distance on the cortical space. We implemented these tools using a Sinkhorn’s algorithm on a discretized version of the cortical manifold, and via a Gabor Wavelet Pyramid system we also implemented the retinal path $I_t$. We tested positively our model on rigid rotational movements of multi-scale shapes, verifying the shape conservation.

The Sinkhorn’s algorithm does not minimize the original cost $\langle c,\gamma \rangle $, but a regularized version of it with an entropic correction $\langle c,\gamma \rangle -\varepsilon \cdot h(\gamma )$. If $\gamma ^\varepsilon $ is a solution to the regularized problem, we know that $\gamma ^\varepsilon \xrightarrow {\varepsilon \rightarrow 0}\gamma ^\star $ where $\gamma ^\star $ is a solution of the original problem. The difference between $\gamma ^\varepsilon $ and $\gamma ^\star $ is what induces a blurring on the resulting moving shapes in our implementation. We de-blurred the images by filtering them through a sigmoid function.

Knowing that $\left\Vert \gamma ^\varepsilon -\gamma ^\star \right\Vert $ is controlled by $\varepsilon $, in future works we intend to follow the Rigollet-Weed model [26] and use another optimal transport technique to do the de-blurring. Indeed, we can suppose that the noise factor $\sigma ^2$ in [26] depends directly on $\varepsilon $, and from there we can develop the Wasserstein distance minimization as conceived in the work above.

In future, we also intend to improve the representation of translational movements. In our actual implementations, translation works correctly particularly along the boundary directions. In order to extend our model by considering spatio-temporal Gabor filtering, as done for example in [4], we could improve translation in the direction orthogonal to boundaries, still implementing the optimal transport tools considered in this work and thus preserving the boundary shapes of the retinal inputs.

Anyway, the problem with displacements orthogonal to the object boundaries is linked also to the structure of the Gabor Wavelet Pyramid: in order to optimize the sampling, the pyramid frame is usually built by considering $\sigma $ values that are “far” from 0, this emphasizes the boundary constrains. We will work in frame buildings that allow for different $\sigma $ ranges.

Data availability and Code availability

The datasets and codes generated during and analysed during the current study are available from the corresponding author on reasonable request.

References

Ambrosio, L., Crippa, G.: Existence, Uniqueness, Stability and Differentiability Properties of the Flow Associated to Weakly Differentiable Vector Fields, pp. 3–57. Springer, Berlin (2008)
MATH Google Scholar
Ambrosio, L., Gigli, N.: Construction of the parallel transport in the wasserstein space. Methods Appl. Anal. 15(1), 1–30 (2008)
Article MathSciNet MATH Google Scholar
Ambrosio, L., Gigli, N.: A User’s Guide to Optimal Transport, pp. 1–155. Springer, Berlin (2013)
Google Scholar
Barbieri, D., Citti, G., Cocci, G., Sarti, A.: A cortical-inspired geometry for contour perception and motion integration. J. Math. Imaging Vis. 49(3), 511–529 (2014)
Article MATH Google Scholar
Barbieri, D., Citti, G., Sarti, A.: How uncertainty bounds the shape index of simple cells. J. Math. Neurosci. 4(1), 1–15 (2014)
Article MathSciNet MATH Google Scholar
Baspinar, E., Sarti, A., Citti, G.: A sub-riemannian model of the visual cortex with frequency and phase. Preprint arXiv:1910.04992 (2019)
Christensen, O.: An Introduction to Frames and Riesz Bases, Applied and Numerical Harmonic Analysis. Birkhäuser, Boston (2002)
Google Scholar
Citti, G., Sarti, A.: A cortical based model of perceptual completion in the roto-translation space. J. Math. Imaging Vis. 24(3), 307–326 (2006)
Article MathSciNet MATH Google Scholar
Citti, G., Sarti, A.: Neuromathematics of Vision, vol. 32. Springer (2014)
Daubechies, I.: The wavelet transform, time-frequency localization and signal analysis. IEEE Trans. Inf. Theory 36(5), 961–1005 (1990)
Article MathSciNet MATH Google Scholar
Daugman, J.G.: Uncertainty relation for resolution in space, spatial frequency, and orientation optimized by two-dimensional visual cortical filters. JOSA A 2(7), 1160–1169 (1985)
Article Google Scholar
Di Marino, S., Gerolin, A.: Optimal transport losses and sinkhorn algorithm with general convex regularization. Preprint arXiv:2007.00976 (2020)
Favali, M., Abbasi-Sureshjani, S., ter Haar Romeny, B., Sarti, A.: Analysis of vessel connectivities in retinal images by cortically inspired spectral clustering. J. Math. Imaging Vis. 56(1), 158–172 (2016)
Article MathSciNet MATH Google Scholar
Gangbo, W., McCann, R.J.: The geometry of optimal transportation. Acta Math. 177(2), 113–161 (1996)
Article MathSciNet MATH Google Scholar
Hoffman, W.C.: The lie algebra of visual perception. J. Math. Psychol. 3(1), 65–98 (1966)
Article MathSciNet MATH Google Scholar
Hoffman, W.C.: The visual cortex is a contact bundle. Appl. Math. Comput. 32(2–3), 137–167 (1989)
MathSciNet MATH Google Scholar
Hubel, D.H.: Eye, Brain and Vision, Volume 22 of Scientific American Library. Scientific American Press, New York (1988)
Google Scholar
Hubel, D.H., Wiesel, T.N.: Ferrier lecture-functional architecture of macaque monkey visual cortex. Proc. R. Soc. Lond. Ser. B. Biol. Sci. 198(1130), 1–59 (1977)
Google Scholar
Kanizsa, G.: Grammatica del vedere. saggi su percezione e gestalt. Il Mulino, Bologna (1980)
Google Scholar
Kay, K.N., Naselaris, T., Prenger, R.J., Gallant, J.L.: Identifying natural images from human brain activity. Nature 452(7185), 352–355 (2008)
Article Google Scholar
Lee, T.S.: Image representation using 2d gabor wavelets. IEEE Trans. Pattern Anal. Mach. Intell. 18(10), 959–971 (1996)
Article Google Scholar
Nagel, A., Stein, E.M., Wainger, S., et al.: Balls and metrics defined by vector fields i: basic properties. Acta Math. 155, 103–147 (1985)
Article MathSciNet MATH Google Scholar
Petitot, J., Tondut, Y.: Vers une neurogéométrie. fibrations corticales, structures de contact et contours subjectifs modaux. Mathématiques et sciences humaines 145, 5–101 (1999)
Google Scholar
Peyré, G., Cuturi, M.: Computational optimal transport, web site of the book “computational optimal transport”. https://github.com/optimaltransport. Accessed 21 June 2021
Peyré, G., Cuturi, M.: Computational Optimal Transport: With Applications to Data Science. Foundations and trends in machine learning. Now, the essence of knowledge (2019)
Rigollet, P., Weed, J.: Entropic optimal transport is maximum-likelihood deconvolution. Comptes Rendus Mathematique 356(11–12), 1228–1235 (2018)
Article MathSciNet MATH Google Scholar
Ringach, D.L: Spatial structure and symmetry of simple-cell receptive fields in macaque primary visual cortex. J. Neurophysiol. (2002)
Sanguinetti, G., Citti, G., Sarti, A.: A model of natural image edge co-occurrence in the rototranslation group. J. Vis. 10(14), 37–37 (2010)
Article Google Scholar
Sarti, A., Citti, G., Petitot, J.: The symplectic structure of the primary visual cortex. Biol. Cybern. 98(1), 33–48 (2008)
Article MathSciNet MATH Google Scholar
Sarti, A., Citti, G., Piotrowski, D.: Differential heterogenesis and the emergence of semiotic function. Semiotica 2019(230), 1–34 (2019)
Article Google Scholar
Villani, C.: Optimal Transport: Old and New, Grundlehren der mathematischen Wissenschaften. Springer, Berlin (2008)
Google Scholar
Wexler, Y., Shechtman, E., Irani, M.: Space-time video completion. In: Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004., vol. 1, pp. I–I. IEEE (2004)
Zhu, L., Yang, Y., Haker, S., Tannenbaum, A.: An image morphing technique based on optimal mass preserving mapping. IEEE Trans. Image Process. 16(6), 1481–1495 (2007)
Article MathSciNet Google Scholar

Download references

Funding

Open access funding provided by Alma Mater Studiorum - Università di Bologna within the CRUI-CARE Agreement.

Author information

Authors and Affiliations

Mathematics Department, Università di Bologna, Bologna, Italy
Mattia Galeotti & Giovanna Citti
CNRS, CAMS - EHESS, Paris, France
Alessandro Sarti

Authors

Mattia Galeotti
View author publications
You can also search for this author in PubMed Google Scholar
Giovanna Citti
View author publications
You can also search for this author in PubMed Google Scholar
Alessandro Sarti
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Mattia Galeotti.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A

In this first appendix, we prove that the integral

$$\begin{aligned} \int _{M}\psi ^k({\tilde{x}},{\tilde{y}})\mathrm{d}k \end{aligned}$$

(A.1)

is well defined for any pair $({\tilde{x}},{\tilde{y}})\in {\mathbb {R}}^2$ and has finite value 0, independently from the pair $({\tilde{x}},{\tilde{y}})$.

.

Lemma A.1

For any pairs $({\tilde{x}},{\tilde{y}})$ and $({\tilde{x}}',{\tilde{y}}')$ in ${\mathbb {R}}^2$, we have

$$\begin{aligned} \int _{M}\psi ^k({\tilde{x}},{\tilde{y}})\mathrm{d}k=\int _{M}\psi ^k({\tilde{x}}',{\tilde{y}}')\mathrm{d}k. \end{aligned}$$

Proof

If we consider the variable change $(x,y)\mapsto (x-{\tilde{x}},y-{\tilde{y}})$, then

$$\begin{aligned}&\int _{M}\psi ^k({\tilde{x}},{\tilde{y}})\mathrm{d}k\\&=\int _{M}\frac{1}{\sigma ^{3/2}}\psi _0(A^{-1}_{\theta ,\sigma }({\tilde{x}}-x,{\tilde{y}}-y))\mathrm{d}x\mathrm{d}y\mathrm{d}\theta \mathrm{d}\sigma \\&= \int _{M}\frac{1}{\sigma ^{3/2}}\psi _0(A^{-1}_{\theta ,\sigma }(-x,-y))\mathrm{d}x\mathrm{d}y\mathrm{d}\theta \mathrm{d}\sigma \\&=\int _{M}\psi ^k(0,0)\mathrm{d}k, \end{aligned}$$

and this proves the independence from $({\tilde{x}},{\tilde{y}})$. $\square $

In order to prove the convergence of (A.1), we rewrite it by making the change to polar coordinates $({\tilde{x}},{\tilde{y}})\mapsto (r,\alpha )$, such that

$$\begin{aligned} {\tilde{x}}=r\cos (\alpha ),\ {\tilde{y}}=r\sin (\alpha ). \end{aligned}$$

In order to have a simpler notation, we write $({\tilde{x}}_\theta ,{\tilde{y}}_\theta )=R_\theta ({\tilde{x}},{\tilde{y}})$, that is ${\tilde{y}}_\theta =r\sin (\alpha +\theta )$. In this notation, we have

$$\begin{aligned}&\int _{M}\psi ^k(0,0)\mathrm{d}k\\&=\int _{M}\frac{1}{\sigma ^{3/2}}\cdot e^{-\sigma ^{-2}({\tilde{x}}^2+{\tilde{y}}^2)}\sin (-2\sigma ^{-1}{\tilde{y}}_{\theta }) \mathrm{d}{\tilde{x}}\mathrm{d}{\tilde{y}} \mathrm{d}\theta \mathrm{d}\sigma \\&= \int _{M}\frac{re^{-\sigma ^{-2}r^2}}{\sigma ^{3/2}}\sin \left( -\frac{2r}{\sigma }\sin (\alpha +\theta )\right) \mathrm{d}r\mathrm{d}\alpha \mathrm{d}\theta \mathrm{d}\sigma . \end{aligned}$$

Proposition A.2

The following integral converges to 0,

$$\begin{aligned} \int _{M}\psi ^k(0,0)\mathrm{d}k. \end{aligned}$$

Proof

We rewrite again the integral as

$$\begin{aligned}&\int _{S^1\times {\mathbb {R}}^+}\mathrm{d}\theta \sigma ^{1/2}\mathrm{d}\sigma \\&\int _{S^1\times {\mathbb {R}}^+} \frac{1}{\sigma }\mathrm{d}r\mathrm{d}\alpha \frac{re^{-\sigma ^{-2}r^2}}{\sigma } \sin \left( -\frac{2r}{\sigma }\sin (\alpha +\theta )\right) . \end{aligned}$$

We consider the coordinate changes $\alpha '= \alpha -\theta $ and $s=\frac{r}{\sigma }$, obtaining the integral

$$\begin{aligned} \int _{S^1\times {\mathbb {R}}^+}2\pi \sigma ^{1/2}\mathrm{d}\sigma \int _{S^1\times {\mathbb {R}}^+}\mathrm{d}s\mathrm{d}\alpha 'se^{-s^2} \sin \left( -2s\sin {\alpha '}\right) .\nonumber \\ \end{aligned}$$

(A.2)

We have

$$\begin{aligned}&\left| \int _{S^1\times {\mathbb {R}}^+}\mathrm{d}s\mathrm{d}\alpha ' se^{-s^2}\sin \left( -2s\sin \alpha '\right) \right| \\&\le 2\pi \int _0^\infty se^{-s^2}\mathrm{d}s=\pi , \end{aligned}$$

therefore the internal integral in (A.2) has finite value. As the sinus is an odd function, its value is 0. Therefore by the Fubini’s theorem we have $\int _M\psi ^k(0,0)=0$ too. $\square $

Appendix B

The goal of this appendix is to prove that the integral

$$\begin{aligned} C_{\psi }=\int _{{\mathbb {R}}^2}\frac{\left| {\widehat{\psi }}_0(\xi )\right| ^2}{|\xi |^2}\mathrm{d}\xi \end{aligned}$$

(B.1)

is finite and for any $f,g\in L^2({\mathbb {R}}^2)$,

$$\begin{aligned} \int _{M}\frac{\mathrm{d}k}{\sigma ^2}\langle f,\psi ^k\rangle \langle \psi ^k,g\rangle =C_\psi \cdot \langle f,g\rangle . \end{aligned}$$

(B.2)

Observe preliminarily that $\widehat{\psi _0}(0)=0$. Indeed,

$$\begin{aligned} \int _{{\mathbb {R}}^2} e^{-{\tilde{x}}^2-{\tilde{y}}^2}\sin (2{\tilde{y}})\mathrm{d}{\tilde{x}}\mathrm{d}{\tilde{y}}=0, \end{aligned}$$

because the sinus is an odd function.

Lemma B.1

The integral (B.1) is finite.

Proof

We use the notation $\xi =(\xi _1,\xi _2)$. Moreover, we observe $\sin (2{\tilde{y}})=\frac{1}{2i}(e^{2i{\tilde{y}}}-e^{-2i{\tilde{y}}})$. Therefore, we have

$$\begin{aligned} \widehat{\psi _0}(\xi )&=\int _{{\mathbb {R}}^2}\psi _0({\tilde{x}},{\tilde{y}})e^{-2\pi i ({\tilde{x}},{\tilde{y}})\cdot \xi }\mathrm{d}{\tilde{x}} \mathrm{d}{\tilde{y}}\\&=\int _{{\mathbb {R}}^2}e^{-{\tilde{x}}^2-{\tilde{y}}^2-2\pi i \xi _1{\tilde{x}}-2\pi i \xi _2{\tilde{y}}}\cdot \frac{e^{2i{\tilde{y}}}-e^{-2i{\tilde{y}}}}{2i}\mathrm{d}{\tilde{x}}\mathrm{d} {\tilde{y}}. \end{aligned}$$

By completing the squares, we obtain

$$\begin{aligned}&\widehat{\psi _0}(\xi )=\frac{1}{2i}e^{-\pi ^2\xi _1^2}\\&\int _{{\mathbb {R}}^2}e^{-({\tilde{x}}+i\pi \xi _1)^2} \left( e^{-(\pi +1)^2\xi _2^2}e^{-({\tilde{y}}+i(\pi +1)\xi _2)^2}\right. \\&\left. -e^{-(\pi -1)^2\xi _2^2}e^{-({\tilde{y}}+i(\pi -1)\xi _2)^2}\right) \mathrm{d}{\tilde{x}}\mathrm{d}{\tilde{y}}. \end{aligned}$$

For $c\in {\mathbb {R}}$, we denote by D(c) the value

$$\begin{aligned} D(c):=\int _{\mathbb {R}}e^{-({\tilde{x}}+ic)^2}\mathrm{d}{\tilde{x}}, \end{aligned}$$

which is known to be finite. Therefore

$$\begin{aligned} \widehat{\psi _0}(\xi )&=\frac{1}{2i}e^{-\pi ^2\xi _1^2}D(\pi \xi _1) \left( e^{-(\pi +1)^2\xi _2^2}D((\pi +1)\xi _2)\right. \\&\quad \left. -e^{-(\pi -1)^2\xi _2^2}D((\pi -1)\xi _2)\right) . \end{aligned}$$

We know that $D(0)=\sqrt{\pi }$, the integral of the Gaussian function. Therefore the development of $|\widehat{\psi _0}|$ around 0 is

$$\begin{aligned} \left| \widehat{\psi _0}(\xi )\right| =2\pi ^2\xi _2^2+ o(|\xi |^3). \end{aligned}$$

Moreover, $D(c)\le D(0)$ for any $c\in {\mathbb {R}}$, then at infinity the function behaves as the difference of two Gaussians. This proves that

$$\begin{aligned} \int _{{\mathbb {R}}^2}\frac{\left| \widehat{\psi _0}(\xi )\right| ^2}{|\xi |^2}\mathrm{d}\xi \end{aligned}$$

is a convergent integral. $\square $

Proposition B.2

The equality (B.2) is satisfied for any pair of functions $f,g\in L^2({\mathbb {R}}^2)$.

Proof

We start by observing that, for any $k=(x,y,\theta ,\sigma )$,

$$\begin{aligned} \widehat{\psi ^k}(\xi )=\sigma ^{1/2}\cdot \widehat{\psi _0}(\sigma R_{-\theta }\xi )\cdot e^{-2\pi i((x,y)\cdot \xi )}. \end{aligned}$$

(B.3)

Indeed, with the coordinate change $({\tilde{x}}',{\tilde{y}}')=\sigma ^{-1}R_{-\theta }({\tilde{x}}-x,{\tilde{y}}-y)$ we have

$$\begin{aligned}&\widehat{\psi ^k}(\xi )\\&=\int _{{\mathbb {R}}^2} \psi _0(\sigma ^{-1}R_{-\theta }({\tilde{x}}-x, {\tilde{y}}-y))e^{-2\pi i(({\tilde{x}},{\tilde{y}})\cdot \xi )}\mathrm{d}{\tilde{x}}\mathrm{d}{\tilde{y}}\\&=\left( \int _{{\mathbb {R}}^2}\psi _0({\tilde{x}}',{\tilde{y}}')e^{-2\pi i(({\tilde{x}}',{\tilde{y}}')\cdot (\sigma R_{-\theta }\xi ))}\sigma ^{1/2}\mathrm{d}{\tilde{x}}' \mathrm{d}{\tilde{y}}'\right) \\&\quad \times e^{-2\pi i ((x,y)\cdot \xi )}\mathrm{d}{\tilde{x}}\mathrm{d}{\tilde{y}},\\ \end{aligned}$$

and therefore we obtain (B.3).

Moreover we observe that

$$\begin{aligned}&\int _{{\mathbb {R}}^2}\widehat{\psi ^k}(\xi )\cdot \overline{ f(\xi )}\mathrm{d}\xi \\&=\int _{{\mathbb {R}}^2}\sigma ^{1/2} \cdot \widehat{\psi _0}(\sigma R_{-\theta }\xi )\cdot \overline{ f(\xi )} \cdot e^{-2\pi i((x,y)\cdot \xi )}\mathrm{d}\xi \end{aligned}$$

is the Fourier transform of $F_f(\xi ):=\sigma ^{1/2} \cdot \widehat{\psi _0}(\sigma R_{-\theta }\xi )\cdot \overline{ f(\xi )}$ evaluated in (x, y).

In order to prove (B.2), we observe that by the Plancherel theorem,

$$\begin{aligned}&\int _{M}\langle f,\psi ^k\rangle \langle \psi ^k,g\rangle \frac{\mathrm{d}k}{\sigma ^2}\\&=\int _{M}\frac{\mathrm{d}\theta \mathrm{d}\sigma }{\sigma ^2}\left( \int f(\xi )\overline{\widehat{\psi ^k}(\xi )}\mathrm{d}\xi \right) \left( \int \widehat{\psi ^k}(\xi )\overline{g(\xi )}\mathrm{d} \xi \right) \\&=\int _{M}\frac{\mathrm{d}\theta \mathrm{d}\sigma }{\sigma ^2}\overline{\widehat{F_f}(x,y)}\widehat{F_g}(x,y)\mathrm{d}\xi \\&=\int _{M}\frac{\mathrm{d}\theta \mathrm{d}\sigma }{\sigma ^2}\sigma \cdot f(\xi )\cdot \overline{g(\xi )}\cdot \left| \widehat{\psi _0}(\sigma R_{-\theta }\xi )\right| ^2\mathrm{d}\xi . \end{aligned}$$

By Fubini’s theorem, we have

$$\begin{aligned}&\int _{M}\langle f,\psi ^k\rangle \langle \psi ^k,g\rangle \frac{\mathrm{d}k}{\sigma ^2}\\&=\int _{{\mathbb {R}}^2}\mathrm{d}\xi f(\xi )g(\xi )\int _0^{2\pi }\int _0^\infty \frac{\mathrm{d}\theta \mathrm{d}\sigma }{\sigma }\left| \widehat{\psi _0}(\sigma R_{-\theta }\xi )\right| ^2\\&=\int _{{\mathbb {R}}^2}\mathrm{d}\xi f(\xi )g(\xi )\int _{{\mathbb {R}}^2}\frac{\left| \widehat{\psi _0}(h)\right| ^2}{|h|}\mathrm{d}h\\&=\langle f,g\rangle \cdot C_\psi , \end{aligned}$$

where we used the change of coordinates $(\theta ,\sigma )\mapsto h=\sigma R_{-\theta }\xi $. $\square $

Appendix C

In this appendix, we prove that equality (5.2) holds not only in the weak sense, but also in a much stronger version. In particular, for any $\sigma _1,\sigma _2, B\in {\mathbb {R}}^+$ and $f\in L^2({\mathbb {R}}^2)$, we define

$$\begin{aligned} \int _0^{2\pi }\mathrm{d}\theta \int _{\sigma _1}^{\sigma _2}\frac{\mathrm{d}\sigma }{\sigma ^2}\int _{\left\Vert (x,y)\right\Vert \le B}\mathrm{d}x \mathrm{d}y \langle f, \psi ^k\rangle \psi ^k \end{aligned}$$

as the unique element in $L^2({\mathbb {R}}^2)$ whose inner product with any $g\in L^2({\mathbb {R}}^2)$ is

$$\begin{aligned} \int _0^{2\pi }\mathrm{d}\theta \int _{\sigma _1}^{\sigma _2}\frac{\mathrm{d}\sigma }{\sigma ^2}\int _{\left\Vert (x,y)\right\Vert \le B}\mathrm{d}x \mathrm{d}y \langle f, \psi ^k\rangle \langle \psi ^k, g\rangle . \end{aligned}$$

Proposition C.1

For any $f\in L^2({\mathbb {R}}^2)$,

$$\begin{aligned} \lim _{\begin{array}{c} \sigma _1\rightarrow 0\\ \sigma _2,B\rightarrow \infty \\ \end{array}}\left\Vert f-C_\psi ^{-1}\int _0^{2\pi }\int _{\sigma _1}^{\sigma _2}\int _{\left\Vert (x,y)\right\Vert \le B} \langle f,\psi ^k\rangle \psi ^k \frac{1}{\sigma ^2}\mathrm{d}x\mathrm{d}y\mathrm{d}\theta \mathrm{d}\sigma \right\Vert =0. \end{aligned}$$

Proof

The term we are estimating is

$$\begin{aligned} \sup _{\left\Vert g\right\Vert =1}\left| \left\langle f-C_\psi ^{-1}\int _0^{2\pi }\int _{\sigma _1}^{\sigma _2}\int _{\left\Vert (x,y)\right\Vert \le B} \langle f,\psi ^k\rangle \psi ^k\frac{\mathrm{d}k}{\sigma ^2}, \ g\right\rangle \right| \end{aligned}$$

which is bounded above by

$$\begin{aligned} \sup _{\left\Vert g\right\Vert =1}\left| C_\psi ^{-1}\int _{k\in U_{\sigma _1,\sigma _2,B}}\langle f,\psi ^k\rangle \langle \psi ^k,g\rangle \frac{\mathrm{d}k}{\sigma ^2}\right| , \end{aligned}$$

where $U_{\sigma _1,\sigma _2, B}\subset M$ is defined by

$$\begin{aligned} U_{\sigma _1,\sigma _2,B}=\left\{ k\in M|\ \sigma <\sigma _1 \text{ or } \sigma>\sigma _2 \text{ or } \left\Vert (x,y)\right\Vert > B\right\} . \end{aligned}$$

By Caucy-Schwarz inequality, this is again bounded by

$$\begin{aligned}&\le \sup _{\left\Vert g\right\Vert =1}\left| C_\psi ^{-1}\int _{k\in U_{\sigma _1,\sigma _2,B}}|\langle f,\psi ^k\rangle |^2\frac{\mathrm{d}k}{\sigma ^2}\right| ^{1/2}\\&\times \cdot \left| C_\psi ^{-1}\int _M|\langle g,\psi ^k\rangle |^2\frac{\mathrm{d}k}{\sigma ^2}\right| ^{1/2}. \end{aligned}$$

The second term is $\left\Vert g\right\Vert =1$ by equality (B.2). The first term tends to 0 for $\sigma _1\rightarrow 0$ and $\sigma _2,B\rightarrow \infty $, because the infinite integral $\int _M|\langle f,\psi ^k\rangle |^2\frac{\mathrm{d}k}{\sigma ^2}$ converges to $\left\Vert f\right\Vert ^2$. $\square $

Appendix D

In order to evaluate the distance $d_c$ between two points $p_0=(x_0,y_0,\theta _0,\sigma _0)$ and $p_1=(x_1,y_1,\theta _1,\sigma _1)$, we search for the constant coefficients that allows a vector field Y whose flow relies $p_0$ to $p_1$.

We start by observing that

$$\begin{aligned} {\dot{\theta }}=c_2, \end{aligned}$$

therefore $c_2=\theta _1-\theta _0$ and

$$\begin{aligned} \theta _t=c_2\cdot t+\theta _0. \end{aligned}$$

Regarding the $\sigma $ variable, we have

$$\begin{aligned} {\dot{\sigma }}=\frac{c_4}{h_2}\cdot \sigma , \end{aligned}$$

therefore $c_4=\ln \left( \frac{\sigma _1}{\sigma _0}\right) \cdot h_2$ and

$$\begin{aligned} \sigma _t=\sigma _0\cdot e^{\frac{c_4t}{h_2}}. \end{aligned}$$

In what follows, we will use the notation ${\tilde{c}}_4:=\frac{c_4}{h_2}$.

Regarding the first two coordinates

$$\begin{aligned} \left( {\begin{array}{c}\dot{x}\\ \dot{y}\end{array}}\right) =\left( \begin{array}{cc} \cos \theta &{}-\frac{\sigma }{h_1}\cdot \sin \theta \\ \sin \theta &{} \frac{\sigma }{h_1}\cdot \cos \theta \end{array}\right) \left( {\begin{array}{c}c_1\\ c_3\end{array}}\right) . \end{aligned}$$

We denote by $S_t$ the matrix obtained by integrating the matrix above up to time t, therefore

$$\begin{aligned} S_t=\left( \begin{array}{cc} S_t^{11} &{} S_t^{12}\\ S_t^{21} &{} S_t^{22} \end{array}\right) \end{aligned}$$

with

$$\begin{aligned} S_t^{11}&=\left[ \frac{\sin (\theta _t)}{c_2}\right] ^t_0\\ S_t^{12}&=\left[ \frac{1}{h_1\cdot ({\tilde{c}}_4^2+c_2^2)}\cdot \left( c_2\cdot \sigma _t\cos (\theta _t) -{\tilde{c}}_4\cdot \sigma _t\sin (\theta _t)\right) \right] ^t_0\\ S_t^{21}&=\left[ \frac{-\cos (\theta _t)}{c_2}\right] ^t_0\\&S_t^{22}\\&=\left[ \frac{1}{h_1\cdot ({\tilde{c}}_4^2+c_2^2)}\cdot \left( {\tilde{c}}_4\cdot \sigma _t\cos (\theta _t)+c_2\cdot \sigma _t\sin (\theta _t)\right) \right] ^t_0. \end{aligned}$$

We obtain

$$\begin{aligned} \left( {\begin{array}{c}c_1\\ c_3\end{array}}\right) =S_1^{-1}\left( {\begin{array}{c}x_1-x_0\\ y_1-y_0\end{array}}\right) \end{aligned}$$

and

$$\begin{aligned} \left( {\begin{array}{c}x_t\\ y_t\end{array}}\right) =S_t\left( {\begin{array}{c}c_1\\ c_3\end{array}}\right) +\left( {\begin{array}{c}x_0\\ y_0\end{array}}\right) . \end{aligned}$$

For any two points as above, their distance is

$$\begin{aligned} d_c(p_1,p_0)^2=c_1^2+c_2^2+c_3^2+c_4^2. \end{aligned}$$

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Galeotti, M., Citti, G. & Sarti, A. Cortically Based Optimal Transport. J Math Imaging Vis 64, 1040–1057 (2022). https://doi.org/10.1007/s10851-022-01116-9

Download citation

Received: 29 July 2021
Accepted: 05 July 2022
Published: 30 July 2022
Issue Date: November 2022
DOI: https://doi.org/10.1007/s10851-022-01116-9

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Cortically Based Optimal Transport

Abstract

Similar content being viewed by others

Time Discrete Geodesics in Deep Feature Spaces for Image Morphing

Modelling of the Poggendorff Illusion via Sub-Riemannian Geodesics in the Roto-Translation Group

Image Morphing in Deep Feature Spaces: Theory and Applications

1 Introduction

2 From the Retina to the Output Space

Definition 2.1

Remark 2.2

Remark 2.3

3 The Optimal Transport Problem

Definition 3.1

Proposition 3.2

Lemma 3.3

Definition 3.4

Remark 3.5

Definition 3.6

Definition 3.7

Definition 3.8

Remark 3.9

Definition 3.10

Theorem 3.11

Lemma 3.12

Proposition 3.13

Proof

Definition 3.14

Theorem 3.15

Proof

4 Geodesics in \({\mathcal {P}_2(X)}\)

Definition 4.1

Definition 4.2

Theorem 4.3

Remark 4.4

5 Reconstructing the Visual Input via the Gabor Frame

Definition 5.1

Lemma 5.2

6 Deformation of the Output

Lemma 6.1

Proof

Remark 6.2

Remark 6.3

Remark 6.4

7 Constraining the Output

Theorem 7.1

Remark 7.2

Definition 7.3

Theorem 7.4

Proof

8 Discrete Model

Definition 8.1

Remark 8.2

Remark 8.3

Definition 8.4

Proposition 8.5

Remark 8.6

9 Implementation

10 Conclusions and Future Developments

Data availability and Code availability

References

Funding

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher's Note

Appendices

Appendix A

Lemma A.1

Proof

Proposition A.2

Proof

Appendix B

Lemma B.1

Proof

Proposition B.2

Proof

Appendix C

Proposition C.1

Proof