Cortically based optimal transport

We introduce a model for image morphing in the primary visual cortex V1 to perform completion of missing images in time. We model the output of simple cells through a family of Gabor filters, and the propagation of the neural signal according to the functional geometry induced by horizontal connectivity. We then model the deformation between two images as a path connecting two different outputs. This path is obtained by optimal transport, considering the Wasserstein distance geodesics associated to probability measures naturally induced by the outputs on V1. The frame of Gabor filters allows us to project the output path back, therefore obtaining an associated deformation of the image stimulus. We provide a numerical implementation of our cortical model, assessing its ability to reconstruct rigid motions of simple shapes.


Introduction
The functional geometry of the visual cortex is a widely studied subject. It is known that cells of the primary visual cortex V1 are sensitive to specific features of the visual stimulus, like position, orientation, scale, colour, curvature, velocity and many others [17]. In the seventies the neurophysiologists Hubel and Wiesel discovered the modular organization of the primary visual cortex [18], meaning that cells are spatially organized in such a way that for every point (x, y) of the retinal plane there is an entire set of cells, each one sensitive to a particular instance of the considered feature. This organization corresponds to the so-called hypercolumnar structure. Hypercolumns of cells are then connected to each other by means of the horizontal connectivity, allowing cells of the same kind but sensitive to different points (x, y) of the stimulus to communicate. Hypercolumnar organization and neural connectivity between hypercolumns constitute the functional architecture of the visual cortex, that is, the cortical structure underlying the low level processing of the visual stimulus. The mathematical modelling of the functional architecture of the visual cortex in terms of differential geometry was introduced in the seminal works of Hoffmann [15,16], who proposed to model the hypercolumnar organization in terms of a fiber bundle. Many such results dealing with differential geometry were later given a unified framework under the new name of neurogeometry.
Petitot and Tondut [23] related the contact geometry introduced by Hoffmann to the geometry of the illusory contours of Kanizsa [19]. The problem of completing occluded objects was then addressed by computing geodesic curves in the contact structure.
Then, in [8], Citti and Sarti showed how the functional architecture can be described in terms of Lie group structures. In particular, as proved by Daugman [11], the receptive profiles of simple cells can be modelled as Gabor filters. Since these filters can be obtained via rotation and translation from a fixed one, the functional architecture of the whole family of simple cells has been described through the Euclidean motion group SE(2) [8]. In the presence of a visual stimulus I : R² → [0, 1] on the retinal plane, the action of the whole family of simple cells is obtained by convolving the function I with the bank of Gabor filters. The output of the cells' action is a function µ : R² × S¹ → R. The horizontal connectivity is strongly anisotropic and is modelled via a sub-Riemannian metric. Since it is very common that part of the visual input is occluded, the action of horizontal connectivity allows the completion of the missing part by means of propagation in this space, under the action of advection-diffusion differential operators of Fokker-Planck type. Visual completion problems are then solved via geodesics or minimal surfaces. This approach was extended to scale in [29], to space-time in [4] and to frequency in [6]. In [13] and [30] the lifting has been extended to heterogeneous features defined in different groups. For an extended review on neurogeometry see [9].
In this paper we reconsider the problem of completing a missing stimulus in time by morphing one lifted cortical image into a different one, in terms of optimal transport of a probability distribution in the functional geometry of the cortex. Two images can represent the same object at two different intervals of time, and different algorithms have been proposed to complete the missing images between the two. We recall the results of [32] and the model proposed by [33], which describes the deformation on the image plane as a geodesic in the Wasserstein space.
We propose a cortical version of this phenomenon, using geodesics in the cortical space endowed with the Wasserstein distance. We work in the manifold M = R² × S¹ × R⁺ of cells sensitive to position, orientation and scale (see also [29]), and we develop a model that handles shape rotations well. We point out that the sub-Riemannian metric on R² × S¹ is important to keep the shape of the object together along rotational movements. In fact, the distance function in this metric approximates the statistical correlations of boundaries in natural images [28].
In order to obtain a metric deformation, we consider the positive and negative parts of the output µ at a given time, normalized as two probability measures µ⁺, µ⁻ over M. Following the papers [2,3] of Ambrosio and Gigli, we consider the space P_2(X) of probability measures with finite 2-momentum over X = R² × S¹ × R, endowed with the associated Wasserstein distance; on this space we can find, for any pair of inputs I_0, I_1 taken at different times, a unique constant speed geodesic connecting their associated output measures.
In particular, for any regular measure µ ∈ P_2(X), we obtain Theorem 3.15, a generalization of previously known results, which assures that for any measure ν ∈ P_2(X) there exists a unique transport map T in the sense of Monge's formulation of optimal transport. Using this transport map, it is possible to give an explicit description of the constant speed geodesics in P_2(X); in the case of the measures induced by the output functions, this is done in Remark 4.4 and equation (6.3). Using the frame properties of the family of Gabor filters, we reconstruct with equation (5.5) a path of images I_t connecting the initial and final inputs I_0, I_1.
In the two final sections, we develop a numerical implementation of our model. In fact, the frame generated by the odd Gabor mother function is not invertible in a discrete setting, and in order to make our model workable we use a Gabor Wavelet Pyramid generated by an odd and an even Gabor function. The outputs obtained via these frames are transported following the same approach detailed above. As shown in Section 9, our model allows the deformation of simple shapes through rotation and translation, preserving the basic structures in the treated pictures. As we discuss, the same preservation is not attained by the numerical 2-dimensional 'classical' implementation of optimal transport, that is, the optimal transport between the inputs I_0, I_1 seen as measures on R² with the squared Euclidean distance as the cost function.
In Section 2 we introduce the operation associating an output function to any input via convolution with the family of Gabor filters. In Section 3 the classical problem of optimal transport is introduced, together with the techniques necessary for a general solution. In Section 4 we describe the constant speed geodesics in P_2(X). In Section 5 we treat the properties of continuous frames such as the one of Gabor functions; they are fundamental in reconstructing a path of input images from the deformation path of output functions. Section 6 shows that the measure deformation results are valid in our case. In Section 7 we find a constraining condition implying that the path of output functions µ_t is naturally induced by a path of input images I_t. In Section 8 we introduce the discrete setting and the Gabor Wavelet Pyramid. Finally, in Section 9 we discuss the numerical implementation and its results.

From the retina to the output space
We consider an input function I : R² → [0, 1]. This function models an input received on the retinal plane, and it induces an output function on the cortex. In order to introduce the output function, we recall the odd series of Gabor filters. We call Gabor mother filter the function

ψ_0(x̃, ỹ) := e^{−x̃² − γ·ỹ²} · sin(2πωỹ).

Moreover, we consider the roto-dilation A_{θ,σ} := σ · R_θ, defined for any θ ∈ S¹ and σ ∈ R⁺, where R_θ denotes the rotation by the angle θ. We also consider the application

T_{x,y,θ,σ}(x̃, ỹ) := σ^{-1} · R_{−θ}(x̃ − x, ỹ − y).

All this allows the definition of a family of Gabor filters

ψ_{x,y,θ,σ}(x̃, ỹ) := ψ_0(T_{x,y,θ,σ}(x̃, ỹ)).
In what follows we will denote the filter ψ_{0,0,0,1} by ψ_0 when there is no risk of confusion. We consider the variety M := R² × S¹ × R⁺ with its natural (Lebesgue) measure dk; this is the output space where we build the function µ induced by the input I. For any point k = (x, y, θ, σ) ∈ M, we denote by ψ_k the Gabor filter ψ_{x,y,θ,σ}.

Definition 2.1. Consider a point k = (x, y, θ, σ) in M; the output function of a cell in response to the visual input I is

µ(k) := ⟨I, ψ_k⟩_{L²(R²)}.

Remark 2.2. We work in the case M = R² × S¹ × R⁺ and we put on M the Riemannian structure which encodes the neurogeometry of the cortex. In particular, the R² factor is the retinal plane with the natural projection M → R², θ ∈ S¹ is an angular parameter encoding the border orientation in the processes of border recognition, while σ ∈ R⁺ is a scale parameter in the same process.
In order to define this metric g, consider four vector fields X_1, X_2, X_3, X_4 which at every point span the tangent bundle of M. Let g be the metric such that at any point X_1, X_2, X_3, (1/√σ)X_4 is an orthonormal system; that is, at every point the metric g is represented by a diagonal matrix in this frame. For σ tending to 0, g tends to the sub-Riemannian structure treated in [8] (modulo rescaling); for σ → +∞ it approaches a hyperbolic metric.
In this setting the measure µ · dk = ⟨I, ψ_k⟩ · dk on M is ⟨I, ψ_k⟩ times the measure induced by the metric g.
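As a toy illustration of Definition 2.1 (a sketch under assumed parameter values, not the paper's implementation), the following pure-Python snippet samples an odd Gabor filter on a pixel grid and computes the discretized output µ(k) = ⟨I, ψ_k⟩ as a Riemann sum; the grid size, sampling step and the values of γ, ω are illustrative choices.

```python
import math

# Illustrative parameters (gamma, omega follow the paper's notation,
# but these numeric values are our own choices).
GAMMA, OMEGA = 0.5, 1.0

def gabor_odd(xt, yt):
    """Odd Gabor mother filter psi_0(x~, y~) = exp(-x~^2 - gamma*y~^2) * sin(2*pi*omega*y~)."""
    return math.exp(-xt * xt - GAMMA * yt * yt) * math.sin(2 * math.pi * OMEGA * yt)

def psi_k(x, y, theta, sigma, xt, yt):
    """Filter psi_k at k = (x, y, theta, sigma): psi_0(sigma^-1 R_{-theta}(x~ - x, y~ - y))."""
    dx, dy = xt - x, yt - y
    u = (math.cos(theta) * dx + math.sin(theta) * dy) / sigma
    v = (-math.sin(theta) * dx + math.cos(theta) * dy) / sigma
    return gabor_odd(u, v)

def output(I, k, N=32, step=0.25):
    """Discretized output mu(k) = <I, psi_k>: a Riemann sum over an N x N grid."""
    x, y, theta, sigma = k
    total = 0.0
    for i in range(N):
        for j in range(N):
            xt, yt = (i - N // 2) * step, (j - N // 2) * step
            total += I(xt, yt) * psi_k(x, y, theta, sigma, xt, yt) * step * step
    return total

# A step edge along the x~-axis: strong response for the aligned
# orientation theta = 0, near-zero response for theta = pi/2.
edge = lambda xt, yt: 1.0 if yt > 0 else 0.0
mu_aligned = output(edge, (0.0, 0.0, 0.0, 1.0))
mu_rotated = output(edge, (0.0, 0.0, math.pi / 2, 1.0))
```

The orientation selectivity of the response is exactly the feature encoded by the θ coordinate of M.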
As proved in Appendix A, the integral ∫_M ψ_k(x̃, ỹ) dk is finite and equal to 0, independently of the pair (x̃, ỹ).

The optimal transport problem
We recall the classical Kantorovich formulation of optimal transport. Our main reference is the Ambrosio-Gigli guide [3]. For any Polish space X (i.e. a complete and separable metric space) we denote by P(X) the set of Borel probability measures on X and by B(X) the set of Borel sets of X. Consider two Polish spaces X, Y; if µ ∈ P(X) and T : X → Y is a Borel map, then we denote by T_#µ ∈ P(Y) the pushforward of µ through T, defined by

T_#µ(B) := µ(T^{-1}(B)) for every B ∈ B(Y).

Consider the product X × Y and its associated projections π^X, π^Y. Let c : X × Y → R be a Borel map called cost function, and consider two measures µ ∈ P(X) and ν ∈ P(Y).

Definition 3.1. An admissible transport plan between µ and ν is a measure γ ∈ P(X × Y) such that π^X_# γ = µ and π^Y_# γ = ν, or equivalently

γ(A × Y) = µ(A) and γ(X × B) = ν(B) for every A ∈ B(X), B ∈ B(Y).

We denote the set of admissible transport plans between µ and ν by Adm(µ, ν).
We want to minimize the integral ∫_{X×Y} c(x, y) dγ(x, y) among all the admissible transport plans between µ and ν. We say that γ is induced by a transport map if there exists a Borel map T : X → Y such that γ = (id, T)_# µ. An optimal transport plan is a transport plan γ that realizes the infimum above. It is known that such a minimizer exists under very general conditions. We denote the set of optimal plans by Opt(µ, ν).
Proposition 3.2 (see [31, Theorem 4.1]). Consider µ ∈ P(X) and ν ∈ P(Y). If the cost function c is lower semicontinuous and bounded from below, then there exists a plan γ minimizing the functional ∫ c dγ among all γ ∈ Adm(µ, ν).
We are interested in the cases where an optimal plan is induced by a transport map T. We state [3, Lemma 1.20], referring to the Ambrosio-Gigli paper for a proof.

Lemma 3.3. Consider γ ∈ Adm(µ, ν). Then γ is induced by a map if and only if γ is concentrated on a measurable set Γ ⊂ X × Y such that for µ-a.e. x there exists only one y = T(x) with (x, y) ∈ Γ. In this case, the map T induces γ.
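For intuition on plans induced by maps (a minimal discrete sketch, not taken from the paper): for uniform measures on n points, the plans induced by a map are exactly the permutations of the supports, and Monge's problem can then be solved by brute force over all n! of them.

```python
import itertools, math

# Supports of two uniform discrete measures (illustrative points).
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]   # support of mu
Y = [(2.0, 0.0), (3.0, 0.0), (2.0, 1.0)]   # support of nu

def cost(p, q):
    """Quadratic cost c(x, y) = |x - y|^2, as in the paper's setting."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

best_perm, best_cost = None, math.inf
for perm in itertools.permutations(range(len(Y))):
    total = sum(cost(X[i], Y[perm[i]]) for i in range(len(X)))
    if total < best_cost:
        best_perm, best_cost = perm, total

# Here the identity pairing wins: every point is translated by (+2, 0).
```

The Kantorovich relaxation enlarges this finite set of permutation plans to a convex set of matrices, on which a minimizer always exists.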
In order to introduce the notion of c-concavity for a cost function c, and to show the existence and uniqueness of an optimal transport plan γ induced by a transport map T, we give some definitions following [3, Chap. 1].
Definition 3.4 (Superdifferential). Consider a Riemannian manifold M and any function ϕ : M → R. We define the superdifferential of ϕ at a point x ∈ M as the set ∂^+ϕ(x) of differentials dh(x) of smooth functions h such that ϕ − h attains a local maximum at x. When there is no risk of confusion, we will denote by ∂^+ϕ the associated subspace of the total space T*M. The subdifferential ∂^-ϕ is defined analogously as the set of differentials dh(x) where ϕ − h attains a local minimum at x. Equivalently, identifying differentials with tangent vectors through the metric, v ∈ ∂^+ϕ(x) if and only if ϕ(exp_x(z)) ≤ ϕ(x) + ⟨v, z⟩ + o(|z|) for every z ∈ T_xM. The same holds for ∂^-ϕ(x) with reversed inequality. With this identification, ∂^+ϕ and ∂^-ϕ are subsets of TM.
It is well known that if ϕ is differentiable at x ∈ M, its superdifferential and subdifferential at x coincide and contain only the gradient of ϕ, that is ∂^+ϕ(x) = ∂^-ϕ(x) = {∇ϕ(x)}. Consider two Polish spaces X, Y, and a cost function c : X × Y → R.
Definition 3.7 (c-concavity). We say that a function ϕ : X → R ∪ {−∞} is c-concave if there exists ψ : Y → R ∪ {−∞} such that ϕ(x) = inf_{y∈Y} ( c(x, y) − ψ(y) ). Analogously we have a notion of c-convexity.
With the notation of [31, Chap. 10], the definition above describes a semiconcave function with modulus ω(t) = K t²/2.

Remark 3.9. Observe that by [31, Equation (10.14)], if f is semiconcave and superdifferentiable at x with q ∈ ∂^+f(x), then f(exp_x(v)) ≤ f(x) + ⟨q, v⟩ + (K/2)|v|² for every v ∈ T_xM.

Definition 3.10 (c-superdifferential). Consider a c-concave function ϕ : X → R ∪ {−∞}. The c-superdifferential ∂_c^+ϕ ⊂ X × Y is the set of pairs (x, y) such that ϕ(x) + ϕ^c(y) = c(x, y), where ϕ^c is the c-transform of ϕ. We denote by ∂_c^+ϕ(x) the set of those y such that (x, y) ∈ ∂_c^+ϕ.

Consider two probability measures µ ∈ P(X) and ν ∈ P(Y). In the following we will consider a cost function c : X × Y → R for which there exist two functions a ∈ L¹(µ), b ∈ L¹(ν) respecting the inequality

(3.1) c(x, y) ≤ a(x) + b(y).
Lemma 3.12. For any point y ∈ X, the function d 2 (−, y)/2 is uniformly semiconcave on X.
This is proven following the line of reasoning of [31,Third Appendix], because X is flat and therefore its sectional curvature is everywhere 0.
Proposition 3.13. Consider a c-concave function ϕ : X → R; then it must be semiconcave. Furthermore, for any x ∈ X and any y ∈ ∂_c^+ϕ(x), we have −exp_x^{-1}(y) ∈ ∂^+ϕ(x).

Proof. This is a slight generalization of [3, Proposition 1.30], and we develop the same arguments. As stated in Lemma 3.12, for any y ∈ X the distance function d²(−, y)/2 is semiconcave; therefore Remark 3.9 applies and implies the claim.

Definition 3.14 (Regular measure). We say that a measure µ ∈ P(X) is regular if it vanishes on the set of points of non-differentiability of any semiconcave function ϕ : X → R.
For any Polish space (X, d), we introduce the space of probability measures on X with finite 2-momentum with respect to d:

P_2(X) := { µ ∈ P(X) | ∫_X d²(x, x_0) dµ(x) < ∞ for some, and thus any, x_0 ∈ X }.
We regard P_2(X) as a metric space with respect to the Wasserstein distance W_2, introduced in Section 4.
Theorem 3.15. Consider the riemannian manifold X and a regular probability measure µ ∈ P_2(X). Then for every ν ∈ P_2(X) there exists only one optimal transport plan from µ to ν, and it is induced by a map T. Moreover, the map T can be written as x ↦ exp_x(−∇ϕ(x)) for some c-concave function ϕ : X → R.
Observe that this is a generalization of [3, Theorem 1.33] to the case of the non-compact riemannian manifold X. For another reference see also [14].
Proof. In order to apply Theorem 3.11 we are going to verify that condition (3.1) is respected. In particular, taking a(x) = d²(x, x_0) and b(y) = d²(y, x_0) for any x_0 ∈ X, the triangular inequality gives c(x, y) = d²(x, y)/2 ≤ a(x) + b(y). To have a ∈ L¹(µ) (and analogously b ∈ L¹(ν)) it suffices to have ∫_X d²(x, x_0) dµ(x) < ∞, but this is exactly the definition of µ ∈ P_2(X). Thus, as a consequence of Theorem 3.11, there exists a c-concave function ϕ such that any optimal plan γ is concentrated on ∂_c^+ϕ. By Proposition 3.13, ϕ is semiconcave and therefore differentiable µ-a.e. by the regularity of µ.
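A concrete low-dimensional instance of the optimal map (a sketch, not the paper's setting): on the real line with quadratic cost, the map of Theorem 3.15 reduces to the monotone rearrangement T = F_ν^{-1} ∘ F_µ, which on uniform discrete samples is realized simply by sorting.

```python
# Samples of two measures on R, each carrying uniform weights
# (illustrative values chosen by us).
mu_pts = [0.9, 0.1, 0.5, 0.3]          # samples of mu
nu_pts = [2.1, 2.5, 2.3, 2.9]          # samples of nu

# Sorting both supports and matching by rank realizes the monotone
# rearrangement, which is the optimal map for quadratic cost in 1D.
order_mu = sorted(range(len(mu_pts)), key=lambda i: mu_pts[i])
nu_sorted = sorted(nu_pts)
T = {}                                  # the optimal assignment x -> T(x)
for rank, i in enumerate(order_mu):
    T[mu_pts[i]] = nu_sorted[rank]
# T preserves the order of the points, i.e. it is monotone.
```

This monotonicity is the one-dimensional shadow of the gradient structure T = exp_x(−∇ϕ) of the theorem.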
Constant speed geodesics in the Wasserstein space

The introduction of the Wasserstein distance W_2 on P_2(X) allows the definition of geodesics in this space. We show a theorem of existence and uniqueness for such a geodesic connecting a pair of measures.
We recall that (X, d) is called a geodesic space if for every x, y ∈ X there exists a constant speed geodesic connecting them. We consider the metric space Geod(X) of constant speed geodesics endowed with the sup norm.
On Geod(X) we introduce, for any t ∈ [0, 1], the evaluation map e_t : Geod(X) → X such that e_t(γ) := γ_t. Furthermore we define the Wasserstein distance associated to d on P_2(X).
Definition 4.2 (Wasserstein distance). If µ, ν ∈ P_2(X), then

W_2(µ, ν) := ( min_{γ ∈ Adm(µ,ν)} ∫_{X×X} d²(x, y) dγ(x, y) )^{1/2}.

The following theorem is proved in [3]; for any two probability measures µ_0, µ_1 on a Polish geodesic space, it provides a constant speed geodesic connecting them.

Theorem 4.3 (see [3]). If (X, d) is a Polish geodesic space, then (P_2(X), W_2) is a geodesic space as well. Moreover, any constant speed geodesic connecting µ_0, µ_1 ∈ P_2(X) is of the form µ_t = (e_t)_# γ̃, where γ̃ ∈ P(Geod(X)) is such that (e_0, e_1)_# γ̃ ∈ Opt(µ_0, µ_1).
We consider on X the Lebesgue measure dk. The measures µ that we are going to treat are always induced by a.e.-continuous density functions.
Remark 4.4. We know that the geodesic connecting two points of a complete riemannian manifold X is almost everywhere unique (see for example [31]). This means that the set of pairs (x, y) ∈ X² such that the geodesic between them is unique has full measure. The same is true with respect to any measure on X² induced by an a.e.-continuous density function.
Therefore the maps e_t naturally induce, almost everywhere, a map that we denote in the same way, e_t : X² → X, sending the pair (x, y) to the point γ_t, where γ is the geodesic such that γ_0 = x and γ_1 = y. Consider two probability measures µ_0, µ_1 over X respecting the hypotheses of Theorem 3.15, and let T be the transport map between them; then, as a consequence of Theorem 4.3, the geodesic between µ_0 and µ_1 (unique, by the uniqueness of T) can be written as

µ_t = (e_t ∘ (id, T))_# µ_0.

In what follows we will use the notation e_t^{(T)} := e_t ∘ (id, T), and therefore we will have µ_t = (e_t^{(T)})_# µ_0.
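The action of e_t^{(T)} can be sketched in flat coordinates (an illustrative simplification: on X the interpolation follows the geodesics of the chosen metric, while here we use straight lines). Each point x of the support moves a fraction t of the way towards T(x):

```python
# Displacement interpolation sketch: mu_t = (e_t o (id, T))_# mu_0,
# with e_t(x, Tx) = (1 - t) x + t Tx in flat coordinates.
def e_t(x, Tx, t):
    return tuple((1 - t) * a + t * b for a, b in zip(x, Tx))

support = [(0.0, 0.0), (1.0, 0.0)]                       # support of mu_0
T = {(0.0, 0.0): (2.0, 2.0), (1.0, 0.0): (3.0, 2.0)}     # a transport map

# Support of mu_{1/2}: every point has travelled half of its geodesic.
support_half = [e_t(x, T[x], 0.5) for x in support]
```

Note how the whole support moves coherently, which is precisely the behaviour exploited later to transport shapes rigidly in the lifted space.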

Reconstructing the visual input via the Gabor frame
We are going to introduce the notion of a continuous frame; this allows the reconstruction of a function I on the retinal plane from the datum of an output function µ.
Definition 5.1. Consider a Hilbert space H and a measure space M with a positive measure ρ. A continuous frame is a family of vectors {ψ_k}_{k∈M} such that k ↦ ⟨f, ψ_k⟩ is a measurable function on M for any f ∈ H, and there exist A, B > 0 such that

A‖f‖² ≤ ∫_M |⟨f, ψ_k⟩|² dρ(k) ≤ B‖f‖² for every f ∈ H.

Consider f ∈ H and the mapping h_f : g ↦ ∫_M ⟨f, ψ_k⟩ ⟨ψ_k, g⟩ dρ(k). This map is conjugate linear and moreover it is bounded: indeed, by the Cauchy-Schwarz inequality and the frame bounds, |h_f(g)| ≤ B ‖f‖ ‖g‖. By the Riesz representation theorem, there exists a unique element h ∈ H which verifies h_f = ⟨h, −⟩. We denote this element by ∫_M ⟨f, ψ_k⟩ ψ_k dρ(k), and we denote by S : H → H the operator S(f) := ∫_M ⟨f, ψ_k⟩ ψ_k dρ(k).

Lemma 5.2. The operator S has the following properties: (1) it is bounded, self-adjoint and positive; (2) it is invertible; (3) the family {S^{-1}ψ_k}_{k∈M} is a continuous frame; (4) for any f ∈ H,

f = ∫_M ⟨f, S^{-1}ψ_k⟩ ψ_k dρ(k) = ∫_M ⟨f, ψ_k⟩ S^{-1}ψ_k dρ(k),

where the equality is intended in the weak sense.
For a proof of this see [7, §5.8].
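A finite-dimensional analogue of Definition 5.1 and Lemma 5.2 (illustrative, not the Gabor frame itself): three unit vectors at mutual angles of 120° form a tight frame of R² with S = (3/2)·Id, so S^{-1}ψ_k = (2/3)ψ_k and the reconstruction formula of point (4) becomes exact.

```python
import math

# The "Mercedes-Benz" frame: three unit vectors at angles 0, 120, 240 degrees.
frame = [(math.cos(2 * math.pi * k / 3), math.sin(2 * math.pi * k / 3))
         for k in range(3)]

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def reconstruct(f):
    """f = sum_k <f, S^-1 psi_k> psi_k, with S^-1 = (2/3) Id for this tight frame."""
    out = [0.0, 0.0]
    for psi in frame:
        c = dot(f, psi)                # frame coefficient <f, psi_k>
        out[0] += (2.0 / 3.0) * c * psi[0]
        out[1] += (2.0 / 3.0) * c * psi[1]
    return tuple(out)

f = (1.25, -0.5)
f_rec = reconstruct(f)                 # recovers f up to rounding
```

The continuous Gabor case works in the same way, with the sum replaced by the integral against dρ(k) and the scalar 2/3 replaced by the operator S^{-1} (a constant C_ψ^{-1} in the tight case of Section 5).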
From now on, we suppose M = R² × S¹ × R⁺ with the measure ρ such that dρ(k) = dk/σ². In our case, the vectors ψ_k are the Gabor filters indexed by the points k = (x, y, θ, σ) ∈ M. Consider the usual scalar product ⟨−, −⟩ in L²(R²); in Appendix B we prove that there exists a constant C_ψ ∈ R⁺ such that for any pair of inputs I, I′ in L²(R²),

∫_M ⟨I, ψ_k⟩ ⟨ψ_k, I′⟩ dρ(k) = C_ψ ⟨I, I′⟩_{L²}.

As a corollary,

(5.3) C_ψ I = ∫_M ⟨I, ψ_k⟩ ψ_k dρ(k),

where the equality has to be intended in the weak sense, that is, h_I = C_ψ ⟨I, −⟩_{L²}. In Appendix C we prove that the equality (5.3) is also true in a much stronger sense.
We observe that ⟨I, ψ_{x,y,θ,σ}⟩ is exactly the output function µ associated to I, see Definition 2.1. Therefore, by Lemma 5.2 and equality (5.3),

(5.4) I = C_ψ^{-1} ∫_M µ(k) ψ_k dρ(k).

In the next section, starting from two input functions I_0, I_1, we produce a path µ_t of output functions. In order to produce a path in the input space from µ_t, we define

(5.5) I_t := C_ψ^{-1} ∫_M µ_t(k) ψ_k dρ(k).

Deformation of the output
In this section we show the existence of a path connecting two output functions µ_0, µ_1. In particular, from any output we obtain two probability measures given by its positive and negative parts. Using the results of Sections 3 and 4 we build the paths connecting the associated measures, and conclude from there.
We introduce a new condition on the input image I : R² → [0, 1]. Let µ be the output associated to I, that is µ(k) = ⟨I, ψ_k⟩. We look at µ as a function defined on the whole X = R² × S¹ × R, supported on M = R² × S¹ × R⁺ ⊂ X. We define the two functions µ̃^+ := max(µ, 0) and µ̃^- := max(−µ, 0). We impose the condition

(6.1) ∫_X d²(k, 0) µ̃^+(k) dk < ∞,

and the same for µ̃^-.

Lemma 6.1. If condition (6.1) holds, then ∫_X d²(k, k_0) µ̃^+(k) dk is finite for any k_0 ∈ X.

Proof. We start by observing that ∫_X µ̃^+(k) dk < ∞. Indeed, if we consider the compact set B = {k | d(k, 0) ≤ 1} and the maximum b of µ̃^+ over B, we have ∫_X µ̃^+(k) dk ≤ b · vol(B) + ∫_{X∖B} d²(k, 0) µ̃^+(k) dk < ∞. For any k_0 ∈ X, by the triangular inequality we have d²(k, k_0) ≤ d²(k, 0) + 2 d(k, 0) d(0, k_0) + d²(0, k_0). The first and the last term give finite integrals, by condition (6.1) and by the previous observation. Concerning the middle one, d(k, 0) ≤ max(1, d²(k, 0)), so its integral against µ̃^+ is finite as well, and this concludes the proof.

Therefore for any output function µ : M → R there exist two probability densities µ^+ and µ^- such that µ = m · (µ^+ − µ^-), where m is a positive coefficient and µ^+ · µ^- ≡ 0. This means that µ^+ and µ^- are the positive and negative parts of µ, renormalized in order to become probability densities; a single coefficient m suffices because the two parts have the same mass, since ∫_M µ(k) dk = 0 as a consequence of the computation of Appendix A.
Given two inputs I_0, I_1, we define respectively µ^±_i and m_i for i = 0, 1. Using the equality (5.4) we obtain, for i = 0, 1,

I_i = (m_i / C_ψ) ∫_M ( µ^+_i(k) − µ^-_i(k) ) ψ_k dρ(k).

We consider the complete riemannian variety X = R² × S¹ × R with the Lebesgue measure dk. In particular we consider µ^+_i and µ^-_i, for i = 0, 1, as measures on X even if they are defined over M ⊂ X: the function µ^±_i is identified with the measure µ^±_i · dk where σ > 0, and with the null measure elsewhere. As a consequence of condition (6.1), µ^+_i, µ^-_i ∈ P_2(X) for i = 0, 1.
We consider the pairs µ^+_0, µ^+_1 and µ^-_0, µ^-_1; by Theorem 4.3 and Remark 4.4, there exist two constant speed geodesics µ^+_t and µ^-_t connecting them.

Remark 6.3. In particular, by Theorem 3.15 there exists a transport map T^+ such that µ^+_1 = T^+_# µ^+_0, and the same holds for the negative parts with a transport map T^-. Therefore,

µ^+_t = (e_t^{(T^+)})_# µ^+_0,   µ^-_t = (e_t^{(T^-)})_# µ^-_0.

Remark 6.4. By definition of e_t and of the transport map, the measures µ^±_t are null outside M for any t ∈ [0, 1]; therefore we can always look at them as measures in P_2(M).
Furthermore, we point out that by construction µ^+_t and µ^-_t are absolutely continuous with respect to the Lebesgue measure dk. In the following we will use the notations µ^+_t and µ^-_t interchangeably for the measures and for the density functions (defined over M) when there is no risk of confusion.
We consider a linear variation of the mass m; this means that we define the varying coefficient

(6.2) m_t := (1 − t) m_0 + t m_1.

We define the path of output functions using these coefficients,

(6.3) µ_t := m_t · (µ^+_t − µ^-_t).

Finally, from equation (5.5) we obtain a path of input functions from I_0 to I_1.
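In a discretized setting the decomposition µ = m · (µ^+ − µ^-) can be sketched as follows (illustrative values; note that the zero total mass of the output, coming from Appendix A, makes the positive and negative masses equal):

```python
# A discrete output with zero total mass, as guaranteed for outputs of
# zero-mean (odd) filters.
mu = [0.2, -0.4, 0.0, 0.6, -0.4]

pos = [max(v, 0.0) for v in mu]         # positive part
neg = [max(-v, 0.0) for v in mu]        # negative part (disjoint support)
m = sum(pos)                            # = sum(neg), since sum(mu) = 0

mu_plus = [v / m for v in pos]          # probability density
mu_minus = [v / m for v in neg]         # probability density
recombined = [m * (p - q) for p, q in zip(mu_plus, mu_minus)]
```

Each of `mu_plus` and `mu_minus` is then transported independently along its own Wasserstein geodesic, and the mass m is interpolated linearly as in (6.2).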

Constraining the output
In this section we introduce a useful tool to describe the geodesic µ_t: the so-called weak riemannian structure of (P_2(X), W_2), the space of probabilities endowed with the Wasserstein distance.
If µ_t is an absolutely continuous curve in P_2(X) (with respect to the Wasserstein distance), consider a time dependent vector field v_t on TX such that the following continuity equation is verified in the sense of distributions:

(7.1) (d/dt) µ_t + ∇ · (v_t µ_t) = 0.

For the proof of the following theorem we refer again to [3].

Theorem 7.1 (see [3, Theorem 2.29]). If X is a smooth complete riemannian manifold without boundary, then: (1) for every absolutely continuous curve µ_t ∈ P_2(X) there exists a Borel family of vector fields v_t such that ‖v_t‖_{L²(µ_t)} ≤ |µ̇_t| for a.e. t and the continuity equation (7.1) is satisfied (in the sense of distributions); (2) if (µ_t, v_t) satisfies (7.1) and ∫_0^1 ‖v_t‖_{L²(µ_t)} dt is finite, then µ_t is an absolutely continuous curve (up to a negligible set of points) and |µ̇_t| ≤ ‖v_t‖_{L²(µ_t)} for a.e. t ∈ [0, 1].
Remark 7.2. We recall the Benamou-Brenier formula, proved for example in [3, Proposition 2.30], stating that the minimization problem solved by a geodesic connecting µ_0, µ_1 ∈ P_2(X) can be reformulated in terms of the vector field v_t. In particular we have

W_2²(µ_0, µ_1) = inf ∫_0^1 ‖v_t‖²_{L²(µ_t)} dt,

where the infimum is taken among all weakly continuous distributional solutions of the continuity equation for (µ_t, v_t).
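As a quick numerical sanity check of the continuity equation (7.1) (an illustration, not part of the paper's pipeline), one can verify by finite differences that a profile translating at unit speed solves it:

```python
import math

# mu_t(x) = g(x - t) translating with constant velocity field v_t = 1:
# d/dt mu_t + d/dx (v mu_t) should vanish identically.
def g(x):
    return math.exp(-x * x)

def mu(t, x):
    return g(x - t)

h = 1e-5
t0, x0 = 0.3, 0.7
dmu_dt = (mu(t0 + h, x0) - mu(t0 - h, x0)) / (2 * h)
# Since v = 1, div(v mu) is just the spatial derivative of mu.
div_vmu = (mu(t0, x0 + h) - mu(t0, x0 - h)) / (2 * h)
residual = dmu_dt + div_vmu            # should be ~0 up to discretization error
```

The same cancellation, integrated against test functions, is what the distributional formulation of (7.1) expresses.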
As a direct consequence of Theorem 7.1, for every absolutely continuous curve µ_t in P_2(X) there exists a family of vector fields (v_t) verifying the continuity equation and such that ‖v_t‖_{L²(µ_t)} = |µ̇_t| for a.e. t. This family is not unique in general, but it becomes unique if we require the vector fields to live in the tangent space to P_2(X), defined as follows.

Definition 7.3. If µ ∈ P_2(X), then the tangent space to P_2(X) at µ is defined as the closure in L²(µ) of the gradients of smooth compactly supported functions, Tan_µ P_2(X) := cl_{L²(µ)} { ∇ϕ | ϕ ∈ C_c^∞(X) }.

Therefore for any absolutely continuous curve µ_t in P_2(X) we have an associated vector field v_t. In particular we have it for the geodesics µ^+_t and µ^-_t obtained via two inputs I_0, I_1 (see Remark 6.3). Observe that both µ^+_t and µ^-_t are constant speed geodesics in (P_2(X), W_2), therefore they are absolutely continuous curves, and so Theorem 7.1 applies to them.
We denote by v^+_t and v^-_t the vector fields associated respectively to µ^+_t and µ^-_t. Moreover, we define the normalized image J_t := I_t / m_t. We are interested in the existence of a family J_t of inputs R² → [0, 1] connecting J_0 to J_1, such that J_t is in C¹([0, 1]; L²(R²)) and µ_t(k)/m_t = ⟨J_t, ψ_k⟩ for any k ∈ M. In order to find such a path, we observe that if it exists, then

(d/dt) ⟨J_t, ψ_k⟩ = (d/dt) ( µ_t(k)/m_t ).

From the continuity equation we know that

(d/dt) µ^±_t = −∇ · (v^±_t µ^±_t).

We combine v^+_t and v^-_t into a single field v_t. Indeed, by Theorem 7.1, v^+_t ∈ L²(µ^+_t) and v^-_t ∈ L²(µ^-_t); therefore it is possible to extend both vector fields to the whole M in such a way that v^+_t · µ^-_t ≡ 0 and v^-_t · µ^+_t ≡ 0. Then we have

(d/dt) ( µ_t/m_t ) = −∇ · ( v_t µ_t/m_t ).

For an opportune vector field α ∈ TM, the derivatives of the filters ψ_k can be expressed through α. In particular, if we use the notation (x̃_k, ỹ_k) = σ^{-1} R_{−θ}(x̃ − x, ỹ − y), where k = (x, y, θ, σ) ∈ M, it is straightforward to verify that for any k the vector field α is well defined almost everywhere on R². Therefore for a.e. t ∈ [0, 1] and every k ∈ M,

⟨ (d/dt) J_t, ψ_k ⟩ = ⟨ (α · v_t + ∇ · v_t) J_t, ψ_k ⟩.

Thus, if α · v_t + ∇ · v_t is independent of the variable k as a function R² → R, the fact that (ψ_k)_k is a frame implies

(d/dt) J_t = (α · v_t + ∇ · v_t) · J_t.

This last equality is true in the weak sense, but also in the (stronger) sense shown in Appendix C.
In order to state our last theorem, we introduce two additional conditions. First we impose that the inputs I 0 , I 1 are null outside a compact subset of R 2 . This simply says that the images we are treating are limited in space.
We also impose that the vector fields take their values in a Sobolev space.
Proof. We use the notation u_t := α · v_t + ∇ · v_t and observe that by (7.5), u_t is independent of the point k. As a consequence of this independence, u_t can be defined also where α is singular. We consider the differential equation

(d/dt) J_t = u_t · J_t

and the coefficient m_t as defined in (6.2). Given the initial function J_0 := I_0/m_0, the solution to the equation above is

J_t = J_0 · e^{h_t},

where h_t := ∫_0^t u_s ds, for all t ∈ [0, 1], is a primitive of u_t, and it exists as a consequence of v_t ∈ W^{1,∞}. The function J_0 is null outside a compact subset K ⊂ R², and h_t is continuous, therefore bounded over K. This implies J_t ∈ L²(R²) for any t ∈ [0, 1].
We define I_t := m_t · J_t and ν_t(k) := ⟨m_t · J_t, ψ_k⟩ = ⟨I_t, ψ_k⟩.
By construction, for any t ∈ [0, 1], ν_t satisfies a.e. the continuity equation (7.1) with respect to the vector field v_t. Moreover, ν_0 = µ_0. The solution of the continuity equation under these conditions is unique for absolutely continuous measures (see [1]). Therefore ν_t = µ_t a.e. for a.e. t ∈ [0, 1]. We observe that this also proves that m_1 J_1 = I_1.

Discrete model
In order to produce an implementation of a transport model in the space M = R² × S¹ × R⁺, we have to work in a discrete setting; therefore it is not possible to use the frame ψ_k introduced above, because it is a frame only when the index k varies over the whole M.
We consider instead a discrete frame that produces a new set of output functions. The frame we consider is the so-called Gabor Wavelet Pyramid (see for instance [20]), which relies again on the Gabor complex mother function; we split it into its real and imaginary components, defining two sets of filter functions associated to the two following mother functions:

ψ_e(x̃, ỹ) := e^{−x̃² − γ·ỹ²} · cos(2πωỹ),
ψ_o(x̃, ỹ) := e^{−x̃² − γ·ỹ²} · sin(2πωỹ).
As we have seen, in the case of the continuous frame the elements of the frame are obtained via the action of the Heisenberg group on the mother function; in the case of wavelets the acting group is the affine group. We recall that by ψ_{e,θ} and ψ_{o,θ} we mean the same functions with the rotation by θ applied, that is

ψ_{e,θ}(x̃, ỹ) = ψ_e(R_{−θ}(x̃, ỹ)) = ψ_e( cos(θ)x̃ + sin(θ)ỹ, −sin(θ)x̃ + cos(θ)ỹ ),

and analogously for ψ_o. As developed by Daubechies in [10] for one-variable wavelets, and generalized to two-variable wavelets by Lee in [21], for an opportune choice of a_0, b_0 ∈ R and indices n, k, ℓ, j ∈ Z_{≥0}, the wavelets above form a frame; that is, there exist real numbers A, B > 0 such that for any f ∈ L²(R²) we have

A‖f‖² ≤ Σ_{n,k,ℓ,j} ( |⟨f, ψ_e^{n,k,ℓ,j}⟩|² + |⟨f, ψ_o^{n,k,ℓ,j}⟩|² ) ≤ B‖f‖²,

where the scalar product is the classical one of L²(R²). As a consequence it is possible to reconstruct the function as

f = C Σ_{n,k,ℓ,j} ( ⟨f, ψ_e^{n,k,ℓ,j}⟩ ψ_e^{n,k,ℓ,j} + ⟨f, ψ_o^{n,k,ℓ,j}⟩ ψ_o^{n,k,ℓ,j} ),

where C is a constant and the equality is true in the weak sense. Observe that the splitting of the mother function into its even and odd parts allows us to consider only 'half' of the circle S¹, or equivalently to use an equipartition of the space of directions S¹/{±1}.
In our setting, we consider four output functions obtained by convolving any input I : R² → [0, 1] with the frames above. Then we apply the same procedure as in Section 6: we use optimal transport between functions on M to build a path connecting the initial output to the final output.
Instead of working over the whole M, we consider the compact subset M_c := [0, D]² × S¹ × [σ_min, σ_max], where [0, D]² represents the portion of the plane which is registered on the retina, while [σ_min, σ_max] is the interval where the scale parameter σ varies.
Remark 8.2. The idea behind the Gabor Wavelet Pyramid is that we define the output functions indexed by a discrete subset of M_c. We use the notation ⟦0, N⟧ for the subset {0, 1, …, N} ⊂ Z_{≥0} for any positive integer N. We set a_0 = σ_min and j̄ = log_{a_0}(σ_max). For any j ∈ ⟦1, j̄⟧ we consider the discrete subset

{ (n b_0 a_0^j, k b_0 a_0^j, θ_ℓ, a_0^j) | n, k ∈ ⟦0, D/(b_0 a_0^j)⟧, ℓ ∈ ⟦0, ℓ̄ − 1⟧ },

thus defining different strata of a discrete subset of the whole M_c.
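The sampling lattice of Remark 8.2 can be sketched as follows (a sketch using the parameter values reported in Section 9, namely a_0 = 2, b_0 = 1, D = 32 and 8 orientations; variable names are ours):

```python
import math

a0, b0 = 2.0, 1.0
D = 32
n_theta = 8                              # sampled orientations
j_max = round(math.log(D, a0))           # scales sigma = a0^j, j = 1..j_max

lattice = []
for j in range(1, j_max + 1):
    step = b0 * a0 ** j                  # spatial sampling step at scale j
    n_max = int(D / step)
    for n in range(n_max + 1):
        for k in range(n_max + 1):
            for l in range(n_theta):
                # Directions sampled in S^1/{±1}, thanks to the even/odd split.
                theta = l * math.pi / n_theta
                lattice.append((n * step, k * step, theta, a0 ** j))
```

Coarser scales are sampled with proportionally larger spatial steps, which is what makes the pyramid's total size manageable.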
Remark 8.3. We observe that any wavelet ψ_e^{n,k,ℓ,j} (the same is true for ψ_o) is centered at (n b_0 a_0^j, k b_0 a_0^j). Furthermore, given two inputs I_0, I_1, we can define the following functions on M_c, for i = 0, 1:

µ̃^+_{i,e}(n b_0 a_0^j, k b_0 a_0^j, θ_ℓ, a_0^j) = max( ⟨I_i, ψ_e^{n,k,ℓ,j}⟩, 0 ),
µ̃^-_{i,e}(n b_0 a_0^j, k b_0 a_0^j, θ_ℓ, a_0^j) = −min( ⟨I_i, ψ_e^{n,k,ℓ,j}⟩, 0 ),

and analogously for µ̃^+_{i,o} and µ̃^-_{i,o}. We normalize these functions in order to obtain the probabilities µ^±_{i,e} for i = 0, 1 (and analogously µ^±_{i,o}).
In order to build the transport path µ_t for any of the pairs of output functions above, we focus on the distance over M_c. In fact, instead of working with the distance treated in the previous sections, another distance turns out to be more efficient in our setting, and at the same time it is equivalent to the previous one.
Definition 8.4. Two distances d_1, d_2 over the space Ω are said to be equivalent if for any compact set K ⊂ Ω there exists a constant C > 0 such that C^{-1} d_2(p, q) ≤ d_1(p, q) ≤ C d_2(p, q) for every p, q ∈ K. They are locally equivalent if for any p_0 ∈ Ω there exists a neighborhood U of p_0 such that d_1 and d_2 are equivalent over U.
The main reference for the equivalence result we are going to use is [22]. With our notation (see Remark 2.2), we consider at every point a metric with constant coefficients c_i ∈ R⁺, i = 1, …, 4. In order to evaluate the distance d_c between two points p_0 = (x_0, y_0, θ_0, σ_0) and p_1 = (x_1, y_1, θ_1, σ_1), we consider the constant coefficient vector field Y that induces a curve p_t : [0, 1] → M connecting them. We develop these evaluations in Appendix D. In the next section we show an implementation of our transport model based on the distance d_c.
Remark 8.6. From the previous results we know that there exist four unique paths µ^±_{t,e}, µ^±_{t,o} connecting the associated probabilities. Using the fact that the chosen set of filters is a frame, we reconstruct the intermediary images I_t through the discrete analogue of the reconstruction formula (5.5), where the coefficients m_{e,t} and m_{o,t} depend on the masses of µ̃^±_{i,e} and µ̃^±_{i,o} for i = 0, 1.

Implementation
In order to implement our model, we code a Gabor Wavelet Pyramid as sketched in the previous section, and evaluate the transport maps using a Sinkhorn algorithm of the kind treated by Peyré and Cuturi [25]. In the discrete setting, the transport plan between two functions u_1, u_2 is a matrix P. In particular, if the two functions are represented as vectors of dimension m, the matrix P has dimension m × m and verifies

(9.1) P 1_m = u_1 and Pᵀ 1_m = u_2,

where 1_m is the vector of length m whose coordinates are all 1; therefore (9.1) means that P ∈ Adm(u_1, u_2). We recall that the optimal transport plan P between two functions u_1, u_2 minimizes the product ⟨P, C⟩, where C is the cost matrix associated to our setting, and P respects condition (9.1). In the case of the Sinkhorn algorithm with regularization coefficient ε, the minimized quantity is

(9.2) ⟨P, C⟩ − ε · H(P),

where H(P) is the entropy of P. The uniqueness of the solution is in fact true for any strictly concave function H (see [12] for a wider analysis of entropy regularizations). A well known result [25, §4] states that if L_C(u_1, u_2) is the minimum reached by ⟨P, C⟩ under condition (9.1), and L_C^ε(u_1, u_2) is the minimum of (9.2) under the same condition, then L_C^ε(u_1, u_2) → L_C(u_1, u_2) as ε → 0. We observe that the plan achieving L_C(u_1, u_2) is not always unique in the discrete setting, while the one achieving L_C^ε(u_1, u_2) is unique, as proved for example in [25, Proposition 4.3]. Moreover, if P^ε is the optimal plan for L_C^ε, then P^ε → P̄, where P̄ is the maximum entropy plan among those achieving the optimum L_C.
In order to compute the geodesic between µ⁺_{0,e} and µ⁺_{1,e} we define the optimal plan for this case,

P⁺_e := argmin_{P ∈ Adm(µ⁺_{0,e}, µ⁺_{1,e})} (⟨P, C⟩ − ε·H(P)),

and consider the meshgrid of Remark 8.2. The pseudocode for the computation is the following.
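Since the pseudocode is not reproduced here, the step can be sketched as follows: given a discrete plan P between measures supported on grid points X, each atom of mass P[i, j] is moved along the segment from X[i] to X[j] and splatted at the nearest grid node. This is our own simplified sketch (straight-line interpolation with nearest-node rounding); on the cortical manifold one would follow the d_c-geodesics instead.

```python
import numpy as np

def interpolate_plan(P, X, t, grid_shape):
    """Discrete displacement interpolation mu_t from a transport plan.

    P          : (m, m) transport plan between two measures on points X.
    X          : (m, d) coordinates of the support points.
    t          : interpolation time in [0, 1].
    grid_shape : shape of the output grid holding mu_t.
    """
    mu_t = np.zeros(grid_shape)
    m = P.shape[0]
    for i in range(m):
        for j in range(m):
            if P[i, j] > 0:
                # each atom moves along the segment from X[i] to X[j]
                p = (1 - t) * X[i] + t * X[j]
                idx = tuple(np.round(p).astype(int))
                mu_t[idx] += P[i, j]
    return mu_t
```

The double loop is for clarity only; a vectorized splatting is straightforward for large grids.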
We proceed analogously for µ⁻_{t,e} and µ±_{t,o}. In order to develop the code we implemented various algorithms from [25]; in particular, we used part of the code developed by the same authors for Sinkhorn's algorithm, available online [24].
In our simulation, we set a₀ = 2, b₀ = 1 and, as already said, D = 32 or 64. We also set the number of sampled orientations to 8, which is coherent with previous results on orientation sampling in primates (see [5, 27]). Furthermore, we set σ_min = 1.1244 and σ_max = σ_min · D. The pseudocode for the complete simulation is the following. The images we consider are simple shapes of the letters 'T' and 'E', the second ones rotated by about π/4 counterclockwise. We show the implementation for this input in order to emphasize the ability of our model to reconstruct rotational displacement. We also add a hammer-type shape, in order to show that the implementation also works well in the case of a multi-scale object, that is, shapes with different thickness along the figure are also well preserved.
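The filter-bank parameters above can be sketched as a generic Gabor bank with geometrically spaced scales between σ_min and σ_min · D and 8 orientations. The filter normalization and frequency convention below are generic assumptions, not necessarily the exact ones of the paper's Gabor Wavelet Pyramid.

```python
import numpy as np

def gabor(sigma, theta, size, freq=1.0):
    """A generic real-valued 2-D Gabor filter (even part).

    The Gaussian envelope has width sigma and the carrier oscillates
    along the direction theta; conventions here are illustrative.
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)    # rotated coordinates
    envelope = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    return envelope * np.cos(freq * xr / sigma)

def filter_bank(D=32, n_theta=8, sigma_min=1.1244, n_scales=5):
    """Scales geometrically spaced in [sigma_min, sigma_min * D],
    with n_theta orientations, as in the simulation parameters."""
    sigmas = np.geomspace(sigma_min, sigma_min * D, n_scales)
    thetas = np.arange(n_theta) * np.pi / n_theta
    return [(s, t, gabor(s, t, size=2 * D + 1)) for s in sigmas for t in thetas]
```

The number of scales n_scales is a free choice here; in a pyramid construction it is tied to the frame conditions discussed in the previous sections.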
Furthermore, we compare this numerical cortical-style implementation with a classical 2-dimensional planar regularized optimal transport implementation, following the general theory treated for example in [25, §3], and evaluating the transport matrix again via Sinkhorn's algorithm. In this case we take the two inputs I₀, I₁ and the probabilities ν₀, ν₁ ∈ P(⟦1, D⟧²) obtained by normalizing the inputs. The cost function is the quadratic cost c((i, j), (k, l)) = (i − k)² + (j − l)² on ⟦1, D⟧² × ⟦1, D⟧². We point out that we applied a smoothing threshold in order to reduce the blur due to the entropic regularization in both our implementations. In particular, we passed the pictures through the sigmoid

Σ(y) = 1 / (1 + e^{−k(y − z₀)}),

where y ∈ [0, 1] is the pixel intensity, and we chose k = 30 and z₀ = 0.65.
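The post-processing step is a one-liner; the sketch below applies the sigmoid Σ pixel-wise with the stated parameters (the function name is ours).

```python
import numpy as np

def sharpen(img, k=30.0, z0=0.65):
    """Sigmoid thresholding Sigma(y) = 1 / (1 + exp(-k (y - z0))),
    applied pixel-wise to intensities y in [0, 1] to reduce the
    entropic blur (k = 30, z0 = 0.65 as in the text)."""
    return 1.0 / (1.0 + np.exp(-k * (img - z0)))
```

Intensities below z₀ are pushed toward 0 and those above toward 1, which sharpens the support of the reconstructed shapes at the price of clipping faint mass.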
In the reconstructed images obtained with the cortical model, we observe the conservation of the image structure along the rotational movement, while in the standard 2-d optimal transport the basic shape is lost. Therefore, lifting the input via Gabor filters and moving the output function through optimal transport tools seems to allow an effective image deformation preserving the fundamental aspects of rigid rotational motion.
If we consider a rigid translation, our procedure works well in the case of simple shapes that are moved along their principal direction, as it is the case for the 'I' shape in Figure 6.

[Figure caption] Reconstruction of missing images via the application of a planar optimal transport in the case of the hammer shape, with cost induced by the square of the euclidean distance, and using Sinkhorn's algorithm with N_iter = 10000 and ε = 0.01. We also processed the image through the sigmoid Σ.

Conclusions and future developments
We formulated the general theory of a lifting of retinal inputs in a 4-dimensional cortical space, and of the time completion between two cortical outputs µ₀, µ₁ (corresponding to inputs I₀, I₁ in the retinal plane). We did the lifting through Gabor filters, and we considered the frame conditions that allow to project cortical measures back to the retinal space, thus obtaining, from the completion paths µ±_t on the cortical space, a retinal path I_t between the original inputs.

Figure 5. Reconstruction of missing images via the application of a planar optimal transport, with cost induced by the square of the euclidean distance, and using Sinkhorn's algorithm with N_iter = 10000 and ε = 0.01. We also processed the image through the sigmoid Σ.
We obtained the cortical paths via methods of optimal transport, where the cost function is the squared distance on the cortical space. We implemented these tools using Sinkhorn's algorithm on a discretized version of the cortical manifold, and via a Gabor Wavelet Pyramid system we also implemented the retinal path I_t. We successfully tested our model on rigid rotational movements of multi-scale shapes, verifying the shape conservation.
Sinkhorn's algorithm does not minimize the original cost ⟨c, γ⟩, but a regularized version of it with an entropic correction, ⟨c, γ⟩ − ε·h(γ). If γ_ε is a solution to the regularized problem, we know that γ_ε → γ̄ as ε → 0, where γ̄ is a solution of the original problem. The difference between γ_ε and γ̄ is what induces a blurring on the resulting moving shapes in our implementation. We de-blurred the images by filtering them through a sigmoid function.
Knowing that γ_ε − γ̄ is controlled by ε, in future works we intend to follow the Rigollet–Weed model [26] and use another optimal transport technique for the de-blurring. Indeed, we can suppose that the noise factor σ² in [26] depends directly on ε, and from there we can develop the Wasserstein distance minimization as conceived in the work above.
In the future we also intend to improve the representation of translational movements. In our current implementation, translation works correctly mainly along the boundary directions. By extending our model to spatio-temporal Gabor filtering, as done for example in [4], we could improve translation in the direction orthogonal to boundaries, still employing the optimal transport tools considered in this work and thus preserving the boundary shapes of the retinal inputs.
However, the problem with displacements orthogonal to the object boundaries is also linked to the structure of the Gabor Wavelet Pyramid: in order to optimize the sampling, the pyramid frame is usually built by considering σ values that are "far" from 0, and this emphasizes the boundary constraints. We will work on frame constructions that allow for different σ ranges.

Appendix A.

Proof. We rewrite the integral again, and consider the coordinate changes α′ = α − θ and s = r/σ, obtaining the integral

(A.2)  ∫_{S¹×ℝ⁺} 2πσ^{1/2} dσ ∫_{S¹×ℝ⁺} s e^{−s²} sin(−2s sin α′) ds dα′.
Appendix B.
The goal of this appendix is to prove that the integral (B.1) is finite and that, for any f, g ∈ L²(ℝ²), identity (B.2) holds.
In order to prove (B.2), we observe that, by the Plancherel theorem and Fubini's theorem, we have

∫_M ⟨f, ψ_k⟩ ⟨ψ_k, g⟩ dk,

where we used the change of coordinates (θ, σ) → h = σ R_{−θ} ξ.
Appendix C.
We obtain, for any two points as above, the explicit expression of their distance.

Data availability. The datasets and code generated and analysed during the current study are available from the corresponding author on reasonable request.