Image Morphing in Deep Feature Spaces: Theory and Applications

This paper combines image metamorphosis with deep features. To this end, images are considered as maps into a high-dimensional feature space and a structure-sensitive, anisotropic flow regularization is incorporated in the metamorphosis model proposed by Miller and Younes (Int J Comput Vis 41(1):61–84, 2001) and Trouvé and Younes (Found Comput Math 5(2):173–198, 2005). For this model, a variational time discretization of the Riemannian path energy is presented and the existence of discrete geodesic paths minimizing this energy is demonstrated. Furthermore, convergence of discrete geodesic paths to geodesic paths in the time continuous model is investigated. The spatial discretization is based on a finite difference approximation in image space and a stable spline approximation in deformation space; the fully discrete model is optimized using the iPALM algorithm. Numerical experiments indicate that the incorporation of semantic deep features is superior to intensity-based approaches.


Introduction
In mathematical imaging, image morphing is the problem of computing a visually appealing transition of two images such that semantically corresponding regions are mapped onto each other.A wellknown approach for image morphing is the metamorphosis model originally introduced by Miller, Trouvé, and Younes [MY01,TY05b,TY05a], which generalizes the flow of diffeomorphism model and the large deformation diffeomorphic metric mapping (LDDMM) which dates back to the pioneering work of Arnold [Arn66] with its exploration and extension in imaging by Dupuis, Grenander and others [DGM98, BMTY05, JM00, MTY02, VS09, VRRC12].From the perspective of the flow of diffeomorphism model, each point of the reference image is transported to the target image in an energetically optimal way such that the image intensity is preserved along the trajectories of the pixels.Here, the energy measures the total dissipation of the underlying flow.The metamorphosis model additionally allows for image intensity modulations along the trajectories by incorporating the magnitude of these modulations, which is reflected by the integrated squared material derivative of image trajectories as a penalization term in the energy functional.Recently, the metamorphosis model has been extended to images on Hadamard manifolds [NPS18,ENR19], to reproducing kernel Hilbert spaces [RY16], to functional shapes [CCT18] and to discrete measures [RY13].For a more detailed exposition of these models we refer the reader to [You10,MTY15] and the references therein.
Starting from the general framework for variational time discretization in geodesic calculus [RW15], a variational time discretization of the metamorphosis model for square-integrable images L 2 (Ω, R n ) was proposed in [BER15].Moreover, the existence of discrete geodesic paths as well as the Moscoconvergence of the time discrete to the time continuous metamorphosis model was proven.However, the classical metamorphosis model, its time discrete counterpart and the spatial discretization based on finite elements in [BER15] exhibit several drawbacks: -The comparison of images in their original gray-or color space is not invariant to natural radiometric transformations caused by lighting or material changes, shadows etc. and hence might lead to a blending along the discrete geodesic path instead of flow-induced geometric transformations.
-Texture patterns, which are important for a natural appearance of images, are often destroyed along the geodesic path due to the color-based matching.
-Sharp interfaces such as object boundaries, which frequently coincide with depth discontinuities of a scene, are in general not preserved along a geodesic path because of the strong smoothness implied by the homogeneous and isotropic variational prior for the deformation fields.
To overcome these problems originating from the intensity-based matching, we propose a multiscale feature space approach incorporating the deep convolution neural network introduced in [SZ14].In detail, this convolutional neural network, which was trained to classify the ImageNet dataset [KSH12], extracts semantic features using 19 weight layers, each composed of small 3 × 3-convolution filters with subsequent nonlinear ReLU activation functions.This network defines a feature extraction operator, where each feature map is considered as a continuous map into some higher-dimensional feature space consisting of vectors in R C , where C ranges from 64 to 512 depending on the considered scale associated with a certain network layer.Compared to the original time discrete metamorphosis model [BER15] we advocate a metamorphosis model in a deep feature space, which amounts to replacing the input images by feature vectors combining image intensities and semantic information generated by the feature extraction operator.To explicitly allow for discontinuities in the deformation fields, we introduce an anisotropic regularization of the time discrete deformation sequence.Since motion discontinuities and object interfaces in images commonly coincide, the considered anisotropy solely depends on the magnitude of image gradients.We prove the existence of discrete geodesic paths for the deep feature metamorphosis model and discuss its Mosco-convergence to the appropriate time continuous metamorphosis model in deep feature space.This in particular implies the convergence of time discrete to time continuous geodesic paths and establishes the existence of time continuous geodesics as minimizers of the time continuous metamorphosis model.
We propose a finite difference/third order B-spline discretization for the fully discrete feature space metamorphosis model and use the iPALM algorithm [PS16] for the optimization, which leads to an efficient and robust computation of morphing sequences that visually outperform the prior intensitybased finite element discretization discussed in [BER15].This scheme is significantly less sensitive to intensity modulations due to the exploitation of semantic information.
Note that this publication is an extended version of the conference proceeding [EKPR19], in which the model is adapted and in addition a rigorous mathematical analysis of this novel model is presented.In fact, the morphing sequence is no longer retrieved in a post-processing step.Instead, the color values are part of the feature vector.Different from the prior proceedings article, we prove the existence of time discrete geodesics in feature space, present a time continuous model and discuss the issue of convergence of the discrete functionals.
Notation.Throughout this paper, we assume that the image domain Ω ⊂ R n for n ∈ {2, 3} is bounded and strongly Lipschitz.We use standard notation for Lebesgue and Sobolev spaces from the image domain Ω to a Banach space X, i.e.L p (Ω, X) and H m (Ω, X) and omit X if the space is clear from the context.The associated norms are denoted by • L p (Ω) and • H m (Ω) , respectively, and the seminorm in for f ∈ H m (Ω).We use the notation C k,α (Ω, X) for Hölder spaces of order k ≥ 0 with regularity α ∈ (0, 1], the corresponding (semi)norm is The symmetric part of a matrix A ∈ R n,n is denoted by A sym , i.e.A sym = 1 2 (A + A ) and the symmetrized Jacobian of a differentiable function φ by ε[φ] = (Dφ) sym .We denote by GL + (n) the elements of GL(n) with positive determinant, and by 1 both the identity map and the identity matrix.Finally, ḟ refers to the temporal derivative of a differentiable function f .Organization.This paper is structured as follows: in section 2, we review the classical metamorphosis model and present its extension to deep feature spaces.Then, in section 3 we introduce the time discrete deep feature metamorphosis model and prove the existence of geodesic paths.In section 4, we present a time continuous metamorphosis model and comment on the Mosco-convergence in deep feature space.The fully discrete model and the optimization scheme using the iPALM algorithm are presented in section 5. Finally, in section 6 several examples demonstrate the applicability of the proposed methods to real image data.

Metamorphosis model
In this section, we briefly review the classical flow of diffeomorphism model and the metamorphosis model as its generalization.Then, we extend the metamorphosis model to the space of deep features, where we additionally incorporate an anisotropic regularization.

Flow of diffeomorphism
In what follows, we present a very short exposition of the flow of diffeomorphism model and refer the reader to [DGM98, BMTY05, JM00, MTY02] for further details.In the flow of diffeomorphism model, the temporal change of image intensities is determined by a family of diffeomorphisms (ψ(t)) t∈[0,1] : Ω → R n describing a flow transporting image intensities along particle paths.The main assumption of this model is the brightness constancy assumption, which is equivalent to a vanishing material derivative D ∂t u = u + v • Du along a path (u(t)) t∈[0,1] in the space of images, where v(t) = ψ(t) • ψ −1 (t) denotes the time-dependent Eulerian velocity.The Riemannian space of images is endowed with the following metric and path energy Note that we use ψ t as a shortcut for the function x → ψ(t, x).Here, the quadratic form L is the higher order elliptic operator where m > 1 + n 2 and λ, µ, γ > 0. Physically, the metric g ψt ( ψt , ψt ) describes the viscous dissipation in a multipolar fluid model as investigated by Nečas and Šilhavý [Nv91].The first two terms of the integrand represent the dissipation density in a Newtonian fluid and the third term can be regarded as a higher order measure for friction.Following [DGM98, Theorem 2.5], paths with a finite energy, which connect two diffeomorphisms ψ 0 = ψ A and ψ 1 = ψ B , are actually one-parameter families of diffeomorphisms.Given two image intensity functions u A , u B ∈ L 2 (Ω), an associated geodesic path is a family of images u = (u(t) : , which minimizes the path energy.The resulting flow of images is given by u(t,

Metamorphosis model in image space
The metamorphosis approach originally proposed by Miller, Trouvé, Younes and coworkers in [MY01,TY05b,TY05a] generalizes the flow of diffeomorphism model by allowing for image intensity variations along motion paths and penalizing the squared material derivative in the metric.Under the assumption that the image path u is sufficiently smooth, the metric and the path energy read as for a penalization parameter δ > 0, where z = D ∂t u = u + v • Du denotes the material derivative of u.The Lagrangian formulation of this variation of the image intensity along motion trajectories can be phrased as follows: for all s, t ∈ [0, 1] we have u(t, ψ t ) − u(s, ψ s ) = t s z(r, ψ r ) dr . (1) Hence, the flow of diffeomorphism model is the limit case of the metamorphosis model for δ → 0. This definition of the metric has two major drawbacks: In general, paths in the space of images do not exhibit any smoothness properties (neither in space nor time), and therefore the evaluation of the material derivative is not well-defined.Moreover, since different pairs (v, D ∂t u) of velocity fields and material derivatives can imply the same time derivative of the image path u, the restriction to equivalence classes of pairs is required, where two pairs are equivalent if and only if they induce the same temporal change of the image path u.
To tackle both problems, Trouvé and Younes [TY05a] proposed a nonlinear geometric structure in the space of RGB images I := L 2 (Ω, R 3 ).In detail, for a given image path u ∈ L 2 ([0, 1], I) and an associated velocity field v ∈ L 2 ((0, 1), V), where ) denotes the velocity space, the weak material derivative z ∈ L 2 ((0, 1), L 2 (Ω, R 3 )) is incorporated in the model, which is implicitly given by for a smooth test function η ∈ C ∞ c ((0, 1) × Ω).We consider (v, z) as a tangent vector in the tangent space of I at the image u and write (v, z) ∈ T u I defined by (2).Indeed, (v, z) represents a variation of the image u via transport and change of intensity.This (weak) formulation and the consideration of equivalence classes of motion fields and material derivatives inducing the same temporal change of the image intensity gives rise to the notion H 1 ([0, 1], I) for regular paths in the space of images.For details we refer the reader to [TY05a].The path energy in the metamorphosis model for a regular path u ∈ H 1 ([0, 1], I) is then defined as

Metamorphosis model in deep feature space
In this subsection, we extend the metamorphosis model to images as maps into a deep feature space with the aim to increase the reliability and robustness of the resulting morphing.To further improve the quality of the deformations, we incorporate an anisotropic regularization of the deformation field.We will compute geodesics path in the feature space To compute the geodesic sequence in the deep feature space, we extract the features f A , f B ∈ F from the fixed input images u A , u B ∈ I and define for a fixed (small) η > 0 The computation of the VGG features is composed of convolution operators and nonlinear ReLU activation functions which are both continuous mappings.Hence, it is reasonable to assume in our mathematical model that the mapping F : , we define for the fully discrete model discussed in section 5 a discrete feature operator to incorporate semantic information in image morphing based on convolutional neural networks, where C ranges from 64 to 512.The parameter η is used to scale down the RGB component mainly needed to compute the anisotropy (see below) and to primarily focus on the actual VGG features when estimating the transport.Next, we include an anisotropic elliptic operator L in our model to properly account for image structures such as sharp edges or corners.To this end, we consider an anisotropy operator a : I → L ∞ (Ω) fulfilling the following assumptions: (A1) boundedness and coercivity: c a < a[u](x) < C a for 0 < c a < C a and all u ∈ I and a.e.x ∈ Ω, In the numerical experiments, we use the operator [PM90] for fixed ξ 1 , ξ 2 > 0, where G σ is the Gaussian kernel with standard deviation σ > 0. Note that (4) satisfies (A1)-(A3).Now we are in the position to introduce the variational model for deep feature metamorphosis.Instead of generalizing the definition of regular paths and adapting the notion of a weak material derivative (2) originally proposed by Trouvé and Younes, we follow the relaxed material derivative approach proposed in [ENR19], in which the material derivative quantity is retrieved from a variational inequality.In [ENR19, Section 3], the equivalence of this energy functional and (3) in the isotropic case has been shown.Let ψ as above denote the Lagrangian flow map induced by the Eulerian motion field with ψt (x) = v(t, ψ t (x)) and ψ 0 (x) = x.Then, we replace the equality (1) (rephrased for the feature map f as for a.e.x ∈ Ω and all 1 ≥ t > s ≥ 0, where formally the scalar valued z = |z| replaces the actually vector-valued material derivative.In fact, this inequality defines a set C(f ) of admissible pairs (v, z) given a path f in L 2 ([0, 1], F).This relaxed approach will turn out to be very natural when it comes to lower semicontinuity of the path energy in the context of the existence proof for geodesic paths.For more details we refer the reader to section 4.
Definition 1 (Continuous path energy).We consider the anisotropic elliptic operator for an anisotropy weight ã ∈ L ∞ (Ω), a velocity field v ∈ V and γ, µ, λ > 0.Then, we define the path energy where denotes the set of admissible pairs of the velocity and a scalar quantity z fulfilling (5).
Let us stress that the anisotropy ã = a[P[f ]] solely takes into account local RGB values and not the actual VGG features with their discriminative multiscale characteristics.
Geodesic curves f ∈ L 2 ([0, 1], F) in the deep feature space joining f A , f B ∈ F are defined as minimizers of the path energy E among all curves with the fixed boundary conditions ) and, by using Sobolev embedding arguments, in The same observation holds for ψ −1 by noting that ψ −1 t (•) is the flow associated with the backward motion field −v(1 − t, •).This together with the variational inequality (5) and z in L 2 ((0, 1) × Ω) Using approximation by smooth functions one shows that t → f (t, •) ∈ C 0 ([0, 1], F) is uniformly continuous, and by using (A3) the mapping t → a[P[f (t, •)]] is well-defined and in C 0 ([0, 1], L ∞ (Ω)).

Variational time discretization
In this section, we develop a variational time discretization of the deep feature space metamorphosis model taking into account the approach presented in [RW15,BER15].
We define the time discrete pairwise energy for two feature maps f, f ∈ F by where Here, the set of admissible deformations is Note that the anisotropy weight only depends on the image component of the second feature f in the pairwise energy.We make the following assumptions with respect to the energy density function W: holds true.
The first two assumptions ensure existence of a minimizing deformation in (7) and the third is a consistency assumption with respect to the differential operator L required to guarantee that the below defined discrete path energy is consistent with the time continuous path energy (6).
The particular energy density function used for all numerical experiments satisfies (W1)-(W3).The first term enforces the positivity of the determinant of the Jacobian matrix of a deformation and favors a balance of shrinkage and growth as advocated in [DR04, BMR13], while the second term penalizes large deviations of the deformation from the identity.Here, the positivity constraint of the determinant of the Jacobian of the deformations prohibits interpenetration of matter [Bal81].
We proceed with the definition of the discrete path energy and the discrete geodesic between two features A discrete geodesic path morphing For arbitrary vectors f = (f 0 , . . ., f K ) ∈ F K+1 and Φ = (φ 1 , . . ., φ K ) ∈ A K we set In what follows, we will investigate the existence of discrete geodesic curves in the time discrete deep feature space metamorphosis model.To this end, we combine the proofs of the local well-posedness of the pairwise energy W with the existence result of a feature vector minimizing E D K for a fixed vector of deformations.We remark that the structure of all proofs is similar to the corresponding proofs in [BER15,Eff18] and we focus on the adaptations necessitated by the anisotropic regularization.
The following lemma, which provides an estimate for the H m (Ω)-norm of the displacement, is crucial for the well-posedness of the energy.
The last term in (11) is bounded by By using the embedding of H m (Ω, Ω) into C 1,α (Ω, Ω) and the uniform boundedness of the minimizing sequence in To control the lower order term appearing on the right-hand side of (11), we define S = {x ∈ Ω : |Dφ(x) − 1| < r W } and use (A1) and (W2) to obtain We remark that the inequality holds true, which follows from Korn's inequality and the Poincaré inequality.Thus, the lemma follows by combining (11), ( 12), ( 13) and ( 14).
Proposition 1 (Well-posedness of W).Let f ∈ F be a fixed feature vector.Under the assumptions (W1)-(W2) and (A1), there exists a constant C W (depending on Ω, m, n, γ, δ, µ, λ, c a , C W,1 , C W,2 , r W ) such that for every fixed Proof.For fixed f ∈ F, let f be a feature vector satisfying (15) for a constant C W specified below.Let {φ j } j∈N ∈ A be a minimizing sequence that converges to Since 1 ∈ A we can deduce using (W1) that for all j ∈ N. Using again the Gagliardo-Nirenberg inequality we infer that {φ j } j∈N is uniformly bounded in H m (Ω, Ω) because of the estimate Due to the reflexivity of H m (Ω, Ω) there exists a weakly convergent subsequence (not relabeled) such that φ j φ in H m (Ω, Ω).By using the Sobolev embedding theorem as well as the Arzelà-Ascoli theorem we can additionally infer that for a subsequence (again not relabeled) . Thus, by choosing C W sufficiently small and taking into account the Lipschitz continuity of the determinant we obtain det(Dφ j ) − 1 L ∞ (Ω) ≤ C det for a constant C det ∈ (0, 1) and all j ∈ N, which implies det(Dφ j ) ≥ C > 0 for a constant C. Note that all estimates remain valid for the limit deformation φ ∈ A. By [Cia88, Theorem 5.5-2] the deformations {φ j } j∈N and φ are C 1 (Ω, Ω)-diffeomorphisms.Finally, (W1) and the lower semicontinuity of the seminorm imply lim inf as j → ∞.To this end, we approximate f by smooth functions where det(D(φ j )) −1 and det(D(φ)) −1 are pointwise estimated by (1−C det ) 1 2 .Finally, by first choosing i and then j we obtain (16) and thereby verify the claim.

This proposition guarantees the existence of an admissible vector of deformations
In what follows, we prove the existence of an energy minimizing vector of features for a fixed vector of deformations.
Proposition 2. Let K ≥ 2, f A , f B ∈ F and Φ = (φ 1 , . . ., φ K ) ∈ A K be fixed.We assume that the deformations satisfy min k∈{1,...,K} for a constant c det > 0.Then, under the assumptions (W1)-(W2) and (A1)-(A2) there exists a feature vector f with f 0 = f A and f K = f B such that Proof.We consider a minimizing sequence of features A straightforward computation reveals where we used (A1), (17) and the transformation formula.Furthermore, again by (17) one obtains Thus, an induction argument (starting from k = K − 1) shows that f j = (f j 1 , . . ., f j K−1 ) is uniformly bounded in F K−1 independently of j, which implies for a subsequence (not relabeled) f j f in F K−1 .In what follows, we prove the weak lower semicontinuity of the discrete path energy along the minimizing sequence.We observe that (A2) implies a for every k = 1, . . ., K. It remains to verify the weak lower semicontinuity of the matching functional, i.e.
for every k = 1, . . ., K. To this end, we first show For every g ∈ F the transformation formula yields which proves the proposition.
We can now combine both previous propositions to prove the existence of discrete geodesics for the deep feature space metamorphosis model.
Theorem 1 (Existence of discrete geodesics).Let the assumptions (W1)-(W2) and (A1)-(A2) be satisfied, K ≥ 2 and f A ∈ F.Then, there exists a constant C E > 0, which is independent of K, such that for every and the associated vector of minimizing deformations consists of C 1 (Ω, Ω)-diffeomorphisms.
Proof.For a fixed f A ∈ F let f B satisfy (20) for a constant C E specified below.For k = 0, . . ., K let be a convex combination of the input features.We first note that is a finite upper bound for the energy.Consider the minimizing sequence which has the finite upper bound E K .Following the same line of arguments as in Proposition 1 we obtain the boundedness of Φ j in H m (Ω, Ω), which results in a weakly convergent subsequence (not relabeled) Φ j Φ in H m (Ω, Ω).Due to H m (Ω, Ω) → C 1 (Ω, Ω) one obtains Φ j → Φ in C 1 (Ω, Ω) for a further subsequence (not relabeled).By taking into account Lemma 1 we get for every j ∈ N and every k = 1, . . ., K. By adapting C E if necessary we can assume for a constant c det > 0. Taking into account [Cia88, Theorem 5.5-2] we can conclude that Φ j and Φ are C 1 (Ω, Ω)-diffeomorphisms.Using Proposition 2 we can replace f j by the energy minimizing feature vector associated with Φ j , which possibly reduces the path energy.The features f j are uniformly bounded in F K+1 , which follows from an analogous reasoning as (18).Thus, f j f holds true for a subsequence (not relabeled) in F K+1 , which implies a Finally, we verify the lower semicontinuity estimate for every k = 1, . . ., K. To this end, we take into account the decomposition The second term is estimated as in the proof of ( 16).Thus it remains to consider the convergence properties of the first term.For a test function g ∈ F we obtain using the transformation rule The right hand side converges to 0 due to the convergence (det(Dφ , which together with the lower semicontinuity of the L 2 -norm proves (21).Altogether, we observe that

Convergence of discrete geodesic paths
In this section, we provide a precise statement of the Mosco-convergence for K → ∞ of a suitable temporal extension of the time discrete path energy E K in the deep metamorphosis model to the time continuous path energy E introduced in Definition 1. Furthermore, the convergence of time discrete geodesics to time continuous geodesic paths is established, which in particular implies the existence of time continuous geodesics in the deep feature metamorphosis model with an anisotropic regularizer.We recall the definition of Mosco-convergence [Mos69], which can be seen as a modification of Γ-convergence.For further details we refer the reader to [DM93].Definition 3. Let X be a Banach space.Consider functionals {E K } K∈N and E from X to R that satisfy (i) for every sequence {x K } K∈N ⊂ X with x K x ∈ X the estimate holds true ("lim inf-inequality"), (ii) for every x ∈ X there exists a recovery sequence {x K } K∈N ⊂ X satisfying x K → x in X such that the estimate lim sup is valid.
Then {E K } K∈N converges to E in the sense of Mosco.In what follows, we define temporal extensions of all relevant quantities required for the statement of the Mosco-convergence.We remark that this construction is similar to [BER15,ENR19], where further details can be found.
To ensure that the involved deformations are diffeomorphisms and to avoid the interpenetration of matter along the morphing sequence, we replace in the definition of A the positivity constraint for the determinant by the stronger condition det(Dφ) ≥ ε for a fixed (small) ε > 0 as already proposed in [ENR19].Note that all existence results from the previous section remain valid for this modified definition.
For K ∈ N, let Φ K = (φ K 1 , . . ., φ K K ) ∈ A K be the vector of optimal deformations associated with a vector of features , and x ∈ Ω the discrete transport map . Following [Cia88, Chapter 5], the condition max k=1,...,K ) is invertible, which follows for K large enough from the lim inf-part of the proof of Theorem 2 below, and we denote the inverse by x K k (t, •).In this case, we consider the feature extension operator to define an extension , where We can now state the main theorem of this section: Theorem 2 (Mosco-convergence of the discrete path energies).Let (W1)-( W3) and (A1)-(A3) be satisfied.Then, the time discrete path energy {E K } K∈N converges to E in the sense of Mosco in the L 2 ([0, 1] × F)-topology for K → ∞.
Proof.The proof follows the structure of the Mosco-convergence proof [ENR19, Theorem 5.2 & Theorem 5.4] with adaptations required due to the incorporation of the anisotropy.Furthermore, in our case images are not pointwise maps into a general Hadamard manifold but rather maps into some Euclidean space.To keep the exposition compact, we focus here on these adaptations.To facilitate reading, we give an overview of the general structure of the proof, which retrieves the overview of the proof structure in [ENR19].Many of the technical arguments already appeared in the proof of existence of discrete geodesics in section 3 and were given there in full detail.Thus, we keep these arguments brief here.
-The identification of the image (feature) and deformation vectors is unaltered compared to the proof in [ENR19].Indeed, one obtains that the sequence f K of feature maps with uniformly bounded energy converges weakly to a feature map f ∈ L 2 ([0, 1], F) with finite energy.In fact, f K and the optimal Φ K can be retrieved from f K , where the existence of Φ K follows as in [ENR19, Lemma 5.1].
-The verification of the lower semi-continuity of the weak material derivative in the sense for z being the weak limit of z K in L 2 ((0, 1) × Ω) with is identical to the corresponding reasoning in [ENR19].At this point we also observe that the velocity field w K with w ) is uniformly bounded in L 2 ((0, 1), V) and thus converges weakly in L 2 ((0, 1), V) to some limit velocity field v.This fact is again proved following the corresponding reasoning as in [ENR19].
-Also the verification of the admissibility of the limit, i.e. (v, z) ∈ C(f ), stays unaltered compared to [ENR19].To this end, one shows that the discrete flow ψ K associated with the motion field v K (t, x) := K(φ ) for a suitable constant α > 0 to the continuous flow induced by the velocity v.Moreover, the variational inequality for the material derivative holds true in the limit.
-In the final step, the lower semi-continuity of the viscous dissipation has to be shown, i.e.
We have to show in addition to [ENR19] that a K converges strongly to a := a[P[f ]] in L ∞ ((0, 1) × Ω).To this end, we first use the uniform boundedness of f K in L ∞ ([0, 1], F), an approximation argument and (A3) to show the convergence a t) in F for every t ∈ [0, 1] using similar arguments as in step (iv) of [BER15, Theorem 4.1], which leads to a K (t, •) → a(t, •) in L ∞ (Ω) using (A2).We are left to show This equicontinuity in time is a consequence of the variational inequality, the uniform boundedness of z K in L 2 ((0, 1) × Ω), the uniform boundedness of ψ K and (ψ K ) −1 in C 0, 1 2 ([0, 1], C 1,α (Ω)) for suitable α > 0, and an approximation argument for f K .Then, the actual lower semicontinuity is verified using a Taylor expansion of W based on (W3) to relate the energy density W with L, where the accumulated remainder is of order K − 1 2 .
(ii) Recovery sequence.Before constructing the recovery sequence, we note that the infimum in (6) is actually attained with an associated pair (v, z) ∈ C(f ), which follows from [ENR19, Proposition 5.3] together with Remark 1.
-To construct the recovery sequence one considers the above pair (v, z) ∈ C(f ) with an associated flow ψ and defines φ -Next, the identification of the recovery sequence limit is done, i.e. one can show that the extension To this end, the discrete flow ψ K associated with the time discrete family of deformations Φ K is defined in the same way as in the proof of the lim inf-inequality.Following [ENR19], the convergence f K → f is implied by the variational inequality and the convergence of ψ K to the time continuous flow ψ associated with v in C 0,α ([0, 1], C 1,α (Ω)) for a suitable α > 0.
-Furthermore, we have to verify the lim sup-inequality.The leading order term of a Taylor expansion of the k-th component of the discrete path energy The remainder is of higher order following the argumentation in [ENR19].Using Jensen's inequality and for a K defined as before.Following [ENR19], we can replace ψ K k (t, •) by the identity in the limit K → ∞.We argue analogously as in the case of the lim inf-inequality to show a K → a in L ∞ ((0, 1) × Ω).Thus, we obtain lim sup Finally, the estimate follows via another application of Jensen's inequality as in [ENR19].
This theorem implies the convergence and existence of geodesic paths for the (time continuous) deep feature space metamorphosis model in the following sense: Theorem 3 (Convergence of discrete geodesics).Suppose that the assumptions (W1)-(W3) and (A1)-(A3) hold true.Let f A , f B ∈ F be fixed.For K ∈ N sufficiently large let f K be a minimizer of E K subject to f K (0) = f A and f K (1) = f B .Then, a subsequence of {f K } K∈N weakly converges in L 2 ([0, 1] × F) to a minimizer of the continuous path energy E as K → ∞.Finally, the associated sequence of discrete path energies converges to the minimal continuous path energy.
Proof.The proof is analogous to the proof of [ENR19, Theorem 5.5].

Fully discrete model in feature space
In this section, we present the fully discrete deep feature space metamorphosis model on the image domain Ω = [0, 1] 2 .We use bold face letters to differentiate discrete feature maps, images, and deformations (also considered as vectors) from their continuous counterparts.For for M, N ≥ 3 we define the computational domain and its boundary as follows: We define the discrete L p -norm of a discrete feature map f as and the set of admissible deformations is given by Furthermore, the discrete Jacobian operator D MN of φ at (i, j) ∈ Ω MN is defined as the forward finite difference operator with Neumann boundary conditions.The discrete image space and the discrete feature space are given by I MN = {u : Ω MN → R 3 } and F MN = {f : Ω MN → R 3+C }, respectively.A numerically reasonable approximation of the spatial warping operator T, which approximates the pullback of a feature channel f • φ at a point (k, l) ∈ Ω MN , is given by where s is the third order B-spline interpolation kernel.Then, the fully discrete mismatch functional Likewise, the lower order anisotropic regularization functional Ω aW(Dφ) dx is discretized as follows: For simplicity, we neglect the H m -seminorm of the deformations and thus restrict to the regularization via discretization.In summary, the fully discrete path energy in the deep metamorphosis model for a (K + 1)-tuple (f k ) K k=0 of discrete feature maps, a K-tuple (φ k ) K k=1 of discrete deformations, and a K-tuple (a k ) K k=1 of discrete anisotropies reads as Finally, a discrete geodesic path (f k ) K k=0 in feature space on a specific multiscale level of a feature hierarchy is a minimizer of E MN K subject to given discrete boundary data f 0 = f A and f K = f B .Here, f A = (ηu A , F MN (u A )) and f B = (ηu B , F MN (u B )), where F MN : I MN → {f : Ω MN → R C } denotes the fully discrete feature extraction operator.
Simple RGB model.As a first model, we consider the simple image intensity-based feature space with C = 0, in which the feature space F MN coincides with the space of RGB images I MN .Since a direct computation of the deformations on the full grid is numerically instable, we incorporate a multilevel scheme.Initially, we start on the coarsest computational domain of size M init × N init with M init = 2 −(L−1) M and N init = 2 −(L−1) N for a given L > 0 and compute a time discrete geodesic sequence for suitably resized input images f A , f B .Then, in subsequent prolongation steps, the width and the height of the computational domain are successively doubled and the initial deformations and images are obtained via a bilinear interpolation of the preceding coarse scale solutions.
Deep feature space.In the second model, image features are extracted using the prominent VGG network with 19 layers as presented in [SZ14] to incorporate semantic information in image morphing.The VGG network is particularly designed for localization and classification of objects in natural images and thus the feature decomposition of images is well-suited for semantic matching.The building blocks of this network are convolutional layers with subsequent ReLU nonlinear activation functions and max pooling layers.Here, the max pooling layers canonically yield a multiscale semantic decomposition of images.
For a given grid Ω MN , the discrete feature maps of the fixed discrete input images u A and u B are F MN [u A ] and F MN [u B ], where the operator F MN is the response of the VGG network up to the layer shown in Table 1.The discrete images u A and u B are downsampled to match the corresponding grid size M N via a bilinear interpolation.In contrast to the simple RGB model, only the deformations are prolongated via a bilinear interpolation in the multilevel approach since successive features on different multilevels are not necessarily related.To stabilize the optimization, the features on each multilevel are first optimized using the prolongated deformations.

Numerical optimization
In what follows, we present the numerical optimization scheme to compute geodesics for the fully discrete deep feature metamorphosis model.Here, we use a variant of the inertial proximal alternating linearized minimization algorithm (iPALM, [PS16]).Several numerical experiments indicate that a direct gradient based minimization of the data mismatch term D MN with respect to the deformations is challenging due to the sensitivity of the warping operator to small perturbations of the deformations.Thus, to enhance the stability of the algorithm the warping operator is linearized w.r.t. the deformation at φ ∈ A MN , which is chosen as the deformation of the previous iteration step in the algorithm.To further improve the stability of the algorithm, the linearization is based on the gradient Λ ), which yields the modified mismatch energy .
The mismatch energy can be efficiently minimized incorporating a proximal mapping, which is defined for a function f : Ω MN → (−∞, ∞] for τ > 0 as follows: The proximal operator with respect to the deformation φ for a fixed τ > 0 is given by where the function values on ∂Ω MN remain unchanged.
Algorithm 1: Algorithm for minimizing E K,MN on one multilevel.
Algorithm 1 summarizes the iteration steps for the minimization of the fully discrete path energy E K,MN , where for a specific optimization variable f the extrapolation with β > 0 of the k th path element in the j th iteration step reads as follows: Here, we use the notation L[f ] for the Lipschitz constant of the function f , which is determined by backtracking.

Numerical results
In this section, numerical results for both the RGB and the deep feature model are shown.All parameters used in the computation are specified in Table 2. Figure 1 depicts the geodesic sequences for two self-portraits by van Gogh 2 (M × N = 496 × 496) for k ∈ {0, 3, 6, 9, 12, 15} obtained with the RGB model (first row) and the deep feature model (fifth row).The superiority of the deep model compared to the simple RGB model is exemplarily visualized by the zoom (magnification factor 4) of the ear region depicted in the second and sixth row.The remaining rows contain the corresponding sequences of anisotropy weights (third/seventh row) and color-coded displacement fields (fourth/eighth row), where the hue refers to the direction of the displacements and the intensity is proportional to its norm as indicated by the leftmost color wheel.Figure 2 presents analogous results for two photos of animals3 for M × N = 512 × 512 with a zoom on the mouth region.Note that the deep model is capable of accurately deforming the carnassial teeth.
Figure 3 shows results of the deep feature model for two paintings of US presidents4 and two portraits of Catherine the Great5 .In both cases, the input images have a resolution of M × N = 512 × 512.
In all numerical experiments, the displacement fields apparently evolve over time and the involved anisotropy promotes large deformation gradients in the proximity of image interfaces.These are indicated by the sharp interfaces in the color coding of the deformations.Both models fail to match image regions with no obvious correspondence of the input images, which can be seen on the cloth regions of the self-portraits, the presidents, and the empress examples, as well as on parts of the body region and the background in the animal example, where blending artifacts occur.The deep feature model clearly outperforms the simple RGB model in regions where the semantic similarity is not reflected by the RGB color features such as the cheek and the ear in the van Gogh example as well as the teeth of the animals.Moreover, to compute a visually appealing time discrete geodesic sequence, a fourth color channel representing a manual segmentation of image regions and a color adaptation of the van Gogh self-portraits was required in [BER15].This is obsolete in the proposed deep feature based model due to the incorporation of semantic information.
Here, the first part u ∈ I of a feature vector f = (u, f ) ∈ F encodes the RGB image intensity values, the remaining component f ∈ L 2 (Ω, R C ) represents deep features, which are high-dimensional local image patterns describing the local structure of the image as a superposition on different levels of a multiscale image approximation.Let us denote by P the projection onto the image component of a feature, i.e.P[f ] = u.

Figure 1 :
Figure 1: Time discrete geodesic sequences of self-portraits by van Gogh for the RGB feature (first row) and deep feature model (fifth row) along with a zoom of the ear region with magnification factor 4 (second/sixth row) and the associated sequences of anisotropy weights (third/seventh row) and colorcoded displacement fields φ k − 1 (fourth/eighth row).Note that the intensity-based approach leads to blending artifacts indicated by the arrows, which are resolved in the deep metamorphosis model.

Figure 2 :
Figure 2: Time discrete geodesic sequences of animal photos for the RGB feature (first row) and deep feature model (fifth row) along with a zoom of the mouth region with magnification factor 4 (second/sixth row) and the associated sequences of anisotropy weights (third/seventh row) and colorcoded displacement fields φ k − 1 (fourth/eighth row).Note that the novel deep feature-based model has significantly less blending artifacts as indicated by the arrows.

Figure 3 :
Figure 3: Pairs of time discrete geodesic paths using the deep feature model and corresponding colorcoded displacement fields for paintings of US presidents (first/second row) as well as for paintings of Catherine the Great (third/forth row).

Table 1 :
Multiscale decomposition of the VGG network used for the discrete feature extraction operator F MN .

Table 2 :
The parameter values for all examples.