Fenchel Duality Theory and a Primal-Dual Algorithm on Riemannian Manifolds

This paper introduces a new notion of a Fenchel conjugate, which generalizes the classical Fenchel conjugation to functions defined on Riemannian manifolds. We investigate its properties, e.g., the Fenchel–Young inequality and the characterization of the convex subdifferential using the analogue of the Fenchel–Moreau Theorem. These properties of the Fenchel conjugate are employed to derive a Riemannian primal-dual optimization algorithm and to prove its convergence for the case of Hadamard manifolds under appropriate assumptions. Numerical results illustrate the performance of the algorithm, which competes with the recently derived Douglas–Rachford algorithm on manifolds of nonpositive curvature. Furthermore, we show numerically that our novel algorithm may even converge on manifolds of positive curvature.

In recent years, optimization on Riemannian manifolds has gained a lot of interest. Optimization on Riemannian manifolds and corresponding algorithms have been investigated for several decades; see for instance Udrişte and the references therein. In particular, we point out the work by Rapcsák with regard to geodesic convexity in optimization on manifolds; see for instance the two references by Rapcsák. The latter of these also serves as a source for optimization problems on manifolds obtained by rephrasing equality constrained problems in vector spaces as unconstrained problems on certain manifolds. For comprehensive textbooks on optimization on matrix manifolds, see Absil, Mahony, Sepulchre and the more recent Boumal.
With the emergence of manifold-valued imaging, for example in InSAR imaging (Bürgmann, Rosen, Fielding), data consisting of orientations, for example in electron backscattered diffraction (EBSD) (Adams, Wright, Kunze; Kunze et al.), dextrous hand grasping (Dirr, Helmke, Lageman), or diffusion tensors in magnetic resonance imaging (DT-MRI), for example discussed in Pennec, Fillard, Ayache, the development of optimization techniques and algorithms on manifolds (especially for non-smooth functionals) has gained a lot of attention. Within these applications, the same tasks appear as for classical, Euclidean imaging, such as denoising, inpainting or segmentation. Both Lellmann et al. and Weinmann, Demaret, Storath introduced the total variation as a prior in a variational model for manifold-valued images. While the first extends a lifting approach previously introduced for cyclic data in Strekalovskiy, Cremers to Riemannian manifolds, the latter introduces a cyclic proximal point algorithm (CPPA) to compute a minimizer of the variational model. Such an algorithm was previously introduced by Bačák on CAT(0) spaces, based on the proximal point algorithm introduced by Ferreira, Oliveira on Riemannian manifolds. Based on these models and algorithms, higher order models have been derived (Bergmann, Laus, et al.; Bačák et al.; Bergmann, Fitschen, et al.; Bredies, Holler, et al.). Using a relaxation, the half-quadratic minimization (Bergmann, Chan, et al.), also known as iteratively reweighted least squares (IRLS) (Grohs, Sprecher), has been generalized to manifold-valued image processing tasks and employs a quasi-Newton method. Finally, the parallel Douglas-Rachford algorithm (PDRA) was introduced on Hadamard manifolds (Bergmann, Persch, Steidl), and its convergence proof is, to the best of our knowledge, limited to manifolds with constant nonpositive curvature. Numerically, the PDRA still performs well on arbitrary Hadamard manifolds. However, for the classical Euclidean case the Douglas-Rachford algorithm is equivalent to applying the alternating directions method of multipliers (ADMM) (Gabay, Mercier) to the dual problem and hence is also equivalent to the algorithm of Chambolle, Pock.
In this paper we introduce a new notion of Fenchel duality for Riemannian manifolds, which allows us to derive a conjugate duality theory for convex optimization problems posed on such manifolds. Our theory allows new algorithmic approaches to be devised for optimization problems on manifolds. In the absence of a global concept of convexity on general Riemannian manifolds, our approach is local in nature. On so-called Hadamard manifolds, however, there is a global notion of convexity and our approach also yields a global method.
The work closest to ours is Ahmadi Kakavandi, Amini, who introduce a Fenchel conjugacy-like concept on Hadamard metric spaces, using a quasilinearization map in terms of distances as the duality product. In contrast, our work makes use of intrinsic tools from differential geometry such as geodesics, tangent and cotangent vectors to establish a conjugation scheme which extends the theory from locally convex vector spaces to Riemannian manifolds. We investigate the application of the correspondence of a primal problem
$$ \text{Minimize} \quad F(p) + G(\Lambda(p)) $$
to a suitably defined dual and derive a primal-dual algorithm on Riemannian manifolds. In the absence of a concept of linear operators between manifolds we follow the approach of Valkonen and state an exact and a linearized variant of our newly established Riemannian Chambolle-Pock algorithm (RCPA). We then study convergence of the latter on Hadamard manifolds. Our analysis relies on a careful investigation of the convexity properties of the functions F and G. We distinguish between geodesic convexity and convexity of a function composed with the exponential map on the tangent space. Both types of convexity coincide on Euclidean spaces. This renders the proposed RCPA a direct generalization of the Chambolle-Pock algorithm to Riemannian manifolds.
As an example of a problem of this type, we detail our algorithm for the anisotropic and isotropic total variation with squared distance data term, i.e., the variants of the ROF model on Riemannian manifolds. After illustrating the correspondence to the classical (Euclidean) Chambolle-Pock algorithm, we compare the numerical performance of the RCPA to the CPPA and the PDRA. While the PDRA has only been shown to converge on Hadamard manifolds of constant curvature, it performs quite well on Hadamard manifolds in general. On the other hand, the CPPA is known to possibly converge arbitrarily slowly, even in the Euclidean case. We illustrate that our linearized algorithm competes with the PDRA, and it even performs favorably on manifolds with non-negative curvature, like the sphere.
The remainder of the paper is organized as follows. We first recall a number of classical results from convex analysis in Hilbert spaces. In an effort to make the paper self-contained, we also briefly state the required concepts from differential geometry. We then develop a complete notion of Fenchel conjugation for functions defined on manifolds. To this end, we extend some classical results from convex analysis and locally convex vector spaces to manifolds, like the Fenchel-Moreau Theorem (also known as the Biconjugation Theorem) and useful characterizations of the subdifferential in terms of the conjugate function. Next, we formulate the primal-dual hybrid gradient method (also referred to as the Riemannian Chambolle-Pock algorithm, RCPA) for general optimization problems on manifolds involving non-linear operators. We present an exact and a linearized formulation of this novel method and prove, under suitable assumptions, convergence of the linearized variant to a minimizer of a linearized problem on arbitrary Hadamard manifolds. As an application of our theory, we analyze several total variation models on manifolds. We then carry out numerical experiments to illustrate the performance of our novel primal-dual algorithm. Finally, we give some conclusions and further remarks on future research.

Preliminaries on Convex Analysis and Differential Geometry
In this section we review some well known results from convex analysis in Hilbert spaces as well as necessary concepts from differential geometry. We also revisit the intersection of both topics, convex analysis on Riemannian manifolds, including its subdifferential calculus.

Convex Analysis
In this subsection let f : X → R̄, where R̄ := R ∪ {±∞} denotes the extended real line and X is a Hilbert space with inner product (· , ·)_X and duality pairing ⟨· , ·⟩_{X*,X}, respectively. Here, X* denotes the dual space of X. When the space X and its dual X* are clear from the context, we omit the space and just write (· , ·) and ⟨· , ·⟩, respectively. For standard definitions like closedness, properness, lower semicontinuity (lsc) and convexity of f we refer the reader, e.g., to the textbooks Rockafellar; Bauschke, Combettes.
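Definition . . The Fenchel conjugate of a function f : X → R̄ is defined as the function f* : X* → R̄ such that
$$ f^*(x^*) := \sup_{x \in X} \bigl\{ \langle x^*, x \rangle - f(x) \bigr\}. $$
We recall some properties of the classical Fenchel conjugate function in the following lemma.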
Lemma . (Bauschke, Combettes, Ch. ). Let f, g : X → R̄ be proper functions, α ∈ R, λ > 0 and b ∈ X. Then the following statements hold.
We now recall some results related to the definition of the subdifferential of a proper function.
The Fenchel biconjugate f** : X → R̄ of a function f : X → R̄ is given by
$$ f^{**}(x) = (f^*)^*(x) = \sup_{x^* \in X^*} \bigl\{ \langle x^*, x \rangle - f^*(x^*) \bigr\}. $$
Finally, we conclude this section with the following result known as the Fenchel-Moreau or Biconjugation Theorem.
Theorem . (Bauschke, Combettes, Thm. ). Given a proper function f : X → R̄, the equality f**(x) = f(x) holds for all x ∈ X if and only if f is lsc and convex. In this case f* is proper as well.
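The conjugate and the biconjugate can be explored numerically on a grid. The following is a minimal sketch, assuming X = R and replacing each supremum by a maximum over finitely many grid points; the helper name `conjugate` and all grid choices are hypothetical. It checks that f(x) = x²/2 is its own conjugate, and that the biconjugate of a nonconvex double well stays below the function and differs from it, in line with the Fenchel-Moreau Theorem.

```python
import numpy as np

def conjugate(values, points, duals):
    """Discrete Fenchel conjugate on a grid: max over points x of (s * x - f(x))."""
    # Rows index the dual variable s, columns index the primal grid point x.
    return np.max(duals[:, None] * points[None, :] - values[None, :], axis=1)

x = np.linspace(-3.0, 3.0, 601)             # primal grid (hypothetical choice)
s = np.linspace(-3.0, 3.0, 601)             # dual grid for the quadratic
s_wide = np.linspace(-100.0, 100.0, 2001)   # dual grid covering the slopes of g on [-3, 3]

# f(x) = x^2 / 2 is convex and lsc, and it is its own conjugate.
f = 0.5 * x**2
f_star = conjugate(f, x, s)

# g is a nonconvex double well, so its biconjugate differs from g on (-1, 1).
g = (x**2 - 1.0) ** 2
g_star = conjugate(g, x, s_wide)
g_bi = conjugate(g_star, s_wide, x)

print(np.max(np.abs(f_star - 0.5 * s**2)))  # close to zero
print(g[300], g_bi[300])                    # values at x = 0: g is 1, g** is close to 0
```

On a grid the biconjugate is only an approximation of the convex envelope; the dual grid has to cover all slopes of the function on the primal grid for the comparison to be meaningful.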

Differential Geometry
This section is devoted to the collection of necessary concepts from differential geometry. For details concerning the subsequent definitions, the reader may wish to consult do Carmo; Lee; Jost.
Suppose that M is a d-dimensional connected, smooth manifold. The tangent space at p ∈ M is a vector space of dimension d and it is denoted by T_pM. Elements of T_pM, i.e., tangent vectors, will be denoted by X_p and Y_p etc. or simply X and Y when the base point is clear from the context. The disjoint union of all tangent spaces, i.e.,
$$ TM := \bigcup_{p \in M} \{p\} \times T_pM, $$
is called the tangent bundle of M. The dual space of T_pM is denoted by T*_pM and it is called the cotangent space to M at p. The disjoint union
$$ T^*M := \bigcup_{p \in M} \{p\} \times T^*_pM $$
is known as the cotangent bundle. Elements of T*_pM are called cotangent vectors to M at p and they will be denoted by ξ_p and η_p or simply ξ and η. The natural duality product between X ∈ T_pM and ξ ∈ T*_pM is denoted by ⟨ξ, X⟩ = ξ(X) ∈ R.
We suppose that M is equipped with a Riemannian metric, i.e., a smoothly varying family of inner products on the tangent spaces T_pM. The metric at p ∈ M is denoted by (· , ·)_p : T_pM × T_pM → R. The induced norm on T_pM is denoted by ‖·‖_p. The Riemannian metric furnishes a linear bijective correspondence between the tangent and cotangent spaces via the Riesz map and its inverse, the so-called musical isomorphisms; see Lee, Ch. . They are defined as
$$ \flat \colon T_pM \ni X \mapsto X^\flat \in T^*_pM, \qquad \langle X^\flat, Y \rangle = (X, Y)_p \quad \text{for all } Y \in T_pM, $$
and its inverse,
$$ \sharp \colon T^*_pM \ni \xi \mapsto \xi^\sharp \in T_pM, \qquad (\xi^\sharp, Y)_p = \langle \xi, Y \rangle \quad \text{for all } Y \in T_pM. $$
The ♯-isomorphism further introduces an inner product and an associated norm on the cotangent space T*_pM, which we will also denote by (· , ·)_p and ‖·‖_p, since it is clear which inner product or norm we refer to based on the respective arguments.
The tangent vector of a curve γ : I → M defined on some open interval I is denoted by γ̇(t). A curve is said to be geodesic if the directional (covariant) derivative of its tangent in the direction of the tangent vanishes, i.e., if ∇_{γ̇(t)} γ̇(t) = 0 holds for all t ∈ I, where ∇ denotes the Levi-Civita connection, cf. do Carmo, Ch. or Lee, Thm. . As a consequence, geodesic curves have constant speed.
We say that a geodesic connects p to q if γ(0) = p and γ(1) = q holds. Notice that a geodesic connecting p to q need not always exist, and if it exists, it need not be unique. If a geodesic connecting p to q exists, there also exists a shortest geodesic among them, which may in turn not be unique. If it is, we denote the unique shortest geodesic connecting p and q by γ_{p,q}.
Using the length of piecewise smooth curves, one can introduce a notion of metric (also known as Riemannian distance) d_M(· , ·) on M; see for instance Lee, Ch. . As usual, we denote by
$$ B_r(p) := \{ q \in M \mid d_M(p, q) < r \} $$
the open metric ball of radius r > 0 with center p ∈ M. Moreover, we define B_∞(p) = ⋃_{r>0} B_r(p).
We denote by γ_{p,X} : I → M, with I ⊂ R being an open interval containing 0, a geodesic starting at p with γ̇_{p,X}(0) = X for some X ∈ T_pM. We denote the subset of T_pM for which these geodesics are well defined until t = 1 by G_p. A Riemannian manifold M is said to be complete if G_p = T_pM holds for some, and equivalently for all p ∈ M.
The exponential map is defined as the function exp_p : G_p → M with exp_p X := γ_{p,X}(1). Note that exp_p(tX) = γ_{p,X}(t) holds for every t ∈ [0, 1]. We further introduce the set G'_p ⊂ T_pM as some open ball of radius 0 < r ≤ ∞ about the origin such that exp_p : G'_p → exp_p(G'_p) is a diffeomorphism. The logarithmic map is defined as the inverse of the exponential map, i.e., log_p : exp_p(G'_p) → G'_p ⊂ T_pM.
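To make the exponential and logarithmic maps concrete, the following sketch implements the well-known closed forms on the unit sphere S² embedded in R³. The function names `sphere_exp` and `sphere_log` are hypothetical, tangent vectors are represented as vectors in R³ orthogonal to the base point, and the logarithm is only evaluated for non-antipodal points, where it is uniquely defined.

```python
import numpy as np

def sphere_exp(p, X):
    """Exponential map on the unit sphere: follow the great circle from p in direction X."""
    norm_X = np.linalg.norm(X)
    if norm_X < 1e-14:
        return p.copy()
    return np.cos(norm_X) * p + np.sin(norm_X) * (X / norm_X)

def sphere_log(p, q):
    """Logarithmic map on the unit sphere for non-antipodal points p and q."""
    cos_angle = np.clip(np.dot(p, q), -1.0, 1.0)
    direction = q - cos_angle * p            # part of q that is tangent to the sphere at p
    norm_dir = np.linalg.norm(direction)
    if norm_dir < 1e-14:
        return np.zeros_like(p)
    return np.arccos(cos_angle) * direction / norm_dir

p = np.array([0.0, 0.0, 1.0])
X = np.array([np.pi / 4, 0.0, 0.0])          # tangent at p since the inner product with p is 0
q = sphere_exp(p, X)
X_back = sphere_log(p, q)                    # recovers X up to rounding
```

The round trip `sphere_log(p, sphere_exp(p, X))` returns X only while the norm of X stays below π, which mirrors the restriction of the exponential map to a ball on which it is a diffeomorphism.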
In the particular case where the sectional curvature of the manifold is nonpositive everywhere, all geodesics connecting any two distinct points are unique. If, furthermore, the manifold is simply connected and complete, the manifold is called a Hadamard manifold; see Bačák. Then the exponential and logarithmic maps are defined globally.
Given p, q ∈ M and X ∈ T_pM, we denote by P_{q←p} X the so-called parallel transport of X along a unique shortest geodesic γ_{p,q}. Using the musical isomorphisms presented above, we also have a parallel transport of cotangent vectors along geodesics according to
$$ P_{q \leftarrow p}\, \xi_p := \bigl( P_{q \leftarrow p}\, \xi_p^\sharp \bigr)^\flat. $$
Finally, by a Euclidean space we mean R^d (where T_pR^d = R^d holds), equipped with the Riemannian metric given by the Euclidean inner product. In this case, exp_p X = p + X and log_p q = q − p hold.

Throughout this subsection, M is assumed to be a complete and connected Riemannian manifold and we are going to recall the basic concepts of convex analysis on M. The central idea is to replace straight lines in the definition of convex sets in Euclidean vector spaces by geodesics.
Definition . (Sakai, Def. IV. ). A subset C ⊂ M of a Riemannian manifold M is said to be strongly convex if for any two points p, q ∈ C, there exists a unique shortest geodesic of M connecting p to q, and that geodesic, denoted by γ_{p,q}, lies completely in C.
On non-Hadamard manifolds, the notion of strongly convex subsets can be quite restrictive. For instance, on the round sphere S^n with n ≥ 1, a metric ball B_r(p) is strongly convex if and only if r < π/2. Note that if C is strongly convex, the exponential and logarithmic maps introduce bijections between C and L_{C,m}, for any m ∈ C. In particular, on a Hadamard manifold M, we have L_{M,m} = T_mM.
The following definition states the important concept of convex functions on Riemannian manifolds.
Similarly, F is called strictly or strongly convex if F ∘ γ_{p,q} fulfills these properties.
Suppose that A ⊂ M. The epigraph of a function F : A → R̄ is defined as epi F := {(p, α) ∈ A × R | F(p) ≤ α}.
Suppose that C ⊂ M is strongly convex and F : C → R̄. Then an equivalent way to describe its lower semicontinuity (item ()) is to require that the composition F ∘ exp_m : L_{C,m} → R̄ is lsc for an arbitrary m ∈ C in the classical sense, where L_{C,m} is defined in Definition . .
We now recall the notion of the subdifferential of a geodesically convex function defined on a Riemannian manifold.
Definition . (Ferreira, Oliveira; Udrişte, Def. ). Suppose that C ⊂ M is strongly convex. The subdifferential ∂_M F on C at a point p ∈ C of a proper, geodesically convex function F : C → R̄ is given by
$$ \partial_M F(p) := \bigl\{ \xi \in T^*_pM \bigm| F(q) \ge F(p) + \langle \xi, \log_p q \rangle \text{ for all } q \in C \bigr\}. $$
In the above notation, the index M refers to the fact that it is the Riemannian subdifferential; the set C should always be clear from the context.
We further recall the definition of the proximal map, which was generalized to Hadamard manifolds in Ferreira, Oliveira.
Definition . . Let M be a Riemannian manifold, F : M → R̄ be proper, and λ > 0. The proximal map of F is defined as
$$ \operatorname{prox}_{\lambda F}(p) := \operatorname*{Arg\,min}_{q \in M} \Bigl\{ \tfrac{1}{2} d_M^2(p, q) + \lambda F(q) \Bigr\}. $$
Note that on Hadamard manifolds, the proximal map is single-valued for proper geodesically convex functions; see Bačák, Ch. , or Ferreira, Oliveira, Lem. , for details. The following lemma is used later on to characterize the proximal map using the subdifferential on Hadamard manifolds.
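As a small illustration of the proximal map, consider the positive real numbers with the metric (X, Y)_p = XY/p², which is a one-dimensional Hadamard manifold with distance d(p, q) = |log(q/p)|. The sketch below is an assumption-laden toy example and not taken from the text: for F(q) = d(q, a) it uses the known closed form that moves p along the geodesic towards a by a length of min(λ, d(p, a)), and it compares the result with a brute-force grid search. All function names are hypothetical.

```python
import numpy as np

def dist(p, q):
    """Riemannian distance on the positive reals with metric (X, Y)_p = X * Y / p**2."""
    return abs(np.log(q / p))

def prox_distance(p, a, lam):
    """Proximal map of F(q) = dist(q, a) with parameter lam > 0, evaluated at p."""
    d = dist(p, a)
    if d == 0.0:
        return p
    step = min(lam, d)                       # never overshoot the point a
    return p * np.exp(step * np.sign(np.log(a / p)))

def prox_brute_force(p, a, lam):
    """Grid search for argmin_q 0.5 * dist(p, q)**2 + lam * dist(q, a)."""
    q = np.exp(np.linspace(-3.0, 3.0, 60001))
    objective = 0.5 * np.log(q / p) ** 2 + lam * np.abs(np.log(q / a))
    return q[np.argmin(objective)]

p, a, lam = 1.0, np.exp(2.0), 0.5
q_closed = prox_distance(p, a, lam)          # equals exp(0.5)
q_grid = prox_brute_force(p, a, lam)         # agrees up to the grid resolution
```

The grid search is only a sanity check; it has no role in an actual algorithm, where closed forms or inner solvers for the proximal map are needed.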
Lemma . (Ferreira, Oliveira, Lem. ). Let F : M → R̄ be a proper, geodesically convex function on the Hadamard manifold M. Then the equality q = prox_{λF} p is equivalent to
$$ \tfrac{1}{\lambda} \bigl( \log_q p \bigr)^\flat \in \partial_M F(q). $$

In this section we present a novel Fenchel conjugation scheme for extended real-valued functions defined on manifolds. We generalize ideas from Bertsekas, who defined local conjugation on manifolds embedded in R^d specified by nonlinear equality constraints.
Throughout this section, suppose that M is a Riemannian manifold and C ⊂ M is strongly convex.
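Remark . . Note that the Fenchel conjugate F*_m depends on both the strongly convex set C and on the base point m. Observe as well that when M is a Hadamard manifold, it is possible to have C = M. In the particular case of the Euclidean space C = M = R^d, Definition . becomes
$$ F_m^*(\xi) = \sup_{X \in R^d} \bigl\{ \langle \xi, X \rangle - F(m + X) \bigr\} = \sup_{Y \in R^d} \bigl\{ \langle \xi, Y - m \rangle - F(Y) \bigr\} = F^*(\xi) - \langle \xi, m \rangle $$
for ξ ∈ R^d. Hence, taking m to be the zero vector we recover the classical (Euclidean) conjugate F* from Definition . with X = R^d.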
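The Euclidean special case in the remark can be checked numerically. The sketch below assumes the one-dimensional case M = R with F(p) = p²/2, for which the classical conjugate is F*(ξ) = ξ²/2, and evaluates the conjugate with respect to a base point m as a maximum over a grid of tangent vectors. The base point, grids and variable names are hypothetical choices.

```python
import numpy as np

m = 1.0                                      # base point (hypothetical choice)
X = np.linspace(-6.0, 6.0, 1201)             # tangent vectors at m, identified with R
xi = np.linspace(-2.0, 2.0, 401)             # cotangent vectors, identified with R

def F(p):
    return 0.5 * p**2

# Conjugate with base point m in the Euclidean case: sup over X of <xi, X> - F(m + X).
F_m_star = np.max(xi[:, None] * X[None, :] - F(m + X)[None, :], axis=1)

# Classical conjugate F*(xi) = xi^2 / 2, shifted by -<xi, m>.
expected = 0.5 * xi**2 - xi * m

print(np.max(np.abs(F_m_star - expected)))   # close to zero
```

Setting `m = 0.0` removes the shift, which corresponds to recovering the classical conjugate.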
Due to the fact that we obtain from Definition . the following representation of the m-conjugate of F: Notice that the conjugate w.r.t. base points other than m does not have a similarly simple expression. In the Euclidean setting with M = R^d, it is well known that We now establish a result regarding the properness of the Note that F**_m is again a function defined on the Riemannian manifold. The relation between F**_m and F is discussed further below, as well as properties of higher order conjugates.
The following lemma proves that our definition of the Fenchel conjugate enjoys properties ()-() stated in Lemma . for the classical definition of the conjugate on a Hilbert space. Results parallel to properties () and () in Lemma . will be given in Lemma . and Proposition . , respectively.
Observe that an analogue of property () in Lemma . cannot be expected for F : M → R̄ due to the lack of a concept of linearity on manifolds.
Lemma . . Suppose that C ⊂ M is strongly convex. Let F, G : C → R̄ be proper functions, m ∈ C, α ∈ R and λ > 0. Then the following statements hold.
Then we have for any This shows (). Similarly, we prove (): let us suppose that G(p) = F(p) + α for all p ∈ C. Then G(exp_m X) = F(exp_m X) + α for every X ∈ L_{C,m}. Hence, for any ξ_m ∈ T*_mM we obtain Let us now prove () and suppose that λ > 0 and G(exp_m X) = λ F(exp_m X) for all X ∈ L_{C,m}. Then we have for any
Suppose that F : C → R̄, where C is strongly convex, and m, m', m'' ∈ C. The following proposition addresses the triconjugate F***_{m m' m''} : T*_{m''}M → R̄ of F, which we define as
Proposition . . Suppose that M is a Hadamard manifold, m ∈ M and F : M → R̄. Then the following holds:
Proof. Using Definitions . , . and . , it is easy to see that holds for all p in M. Now ( . ), Definition . , and the bijectivity of exp_m and log_m imply that holds for all ξ_m ∈ T*_mM. We now set f_m := F ∘ exp_m and use Definitions . and . to infer that holds for all ξ_m ∈ T*_mM. Consequently, we obtain According to Bauschke, Combettes, Prop. (iii), we have f_m*** = f_m*. Collecting all equalities confirms ( . ).
The following is the analogue of item () in Lemma . .
Proposition . (Fenchel-Young inequality).
We continue by introducing the manifold counterpart of the Fenchel-Moreau Theorem, compare Theorem . . Given a set C ⊂ M, m ∈ C and a function F : C → R̄, we define f_m : T_mM → R̄ by
$$ f_m(X) := \begin{cases} F(\exp_m X) & \text{if } X \in L_{C,m}, \\ +\infty & \text{otherwise}. \end{cases} $$
Throughout this section, the convexity of the function f_m : T_mM → R̄ is the usual convexity on the vector space T_mM, i.e., for all X, Y ∈ T_mM and t ∈ [0, 1] it holds
$$ f_m\bigl( (1 - t) X + t Y \bigr) \le (1 - t)\, f_m(X) + t\, f_m(Y). $$
We present two examples of functions F : M → R̄ defined on Hadamard manifolds such that f_m is convex. In the first example, F depends on an arbitrary fixed point p' ∈ M. In this case, we can guarantee that f_m is convex only when m = p'. In the second example, F is defined on a particular Hadamard manifold and f_m is convex for any base point m ∈ M. It is worth emphasizing that the functions in the following examples are geodesically convex as well, but in general the convexity of F and the convexity of f_m are unrelated, and all four cases can occur.
Example . . Let M be any Hadamard manifold and p' ∈ M arbitrary. Consider the function f_{p'} defined in ( . ) with F : M → R̄ given by F(p) = d_M(p', p) for all p ∈ M. Note that f_{p'}(X) = d_M(p', exp_{p'} X) = ‖X‖_{p'} holds for all X ∈ T_{p'}M. Hence, it is easy to see that f_{p'} satisfies ( . ) and, consequently, it is convex on T_{p'}M.
Our second example is slightly more involved. A special case of it appears in the dextrous hand grasping problem in Dirr, Helmke, Lageman, Sect. .
Since (T_mM, (· , ·)_m) is a Hilbert space, the function f_m defined in ( . ) establishes a relationship between the results of this section and the results of Section . . We will exploit this relationship in the demonstration of the following results.
Proof. Since C ⊂ M is strongly convex, () follows directly from ( . ) and the fact that the map exp_m : L_{C,m} → C is bijective. As for (), Definition . and the definition of f_m in ( . ) imply for all ξ ∈ T*_mM. () follows immediately from Bauschke, Combettes, Prop. and (). For (), take p ∈ C arbitrary. Using Definition . and () we have which concludes the proof.
In the following theorem we obtain a version of the Fenchel-Moreau Theorem . for functions defined on Riemannian manifolds. To this end, it is worth noting that if C is strongly convex then Equality ( . ) is an immediate consequence of ( . ), and will be used in the proof of the following two theorems.
Theorem holds for all ξ_m ∈ T*_mM.
Proof. The proof follows directly from the fact that F*_m is defined on the vector space T*_mM and that F*_m is convex due to Lemma . ().
To conclude this section, we state the following result, which generalizes Corollary . and shows the symmetric relation between the conjugate function and the subdifferential when the function involved is proper, convex and lsc.
Corollary . . Let F : C → R̄ be a proper function and m, p ∈ C. If the function f_m defined in ( . ) is convex and lsc on T_mM, then
Proof. The proof is a straightforward combination of Theorems . and . , taking as a particular cotangent vector ξ_m = P_{m←p} ξ_p in Corollary . .

Optimization on Manifolds
In this section we derive a primal-dual optimization algorithm to solve minimization problems on Riemannian manifolds of the form
$$ \text{Minimize} \quad F(p) + G(\Lambda(p)), \quad p \in C. $$
Here C ⊂ M and D ⊂ N are strongly convex sets, F : C → R̄ and G : D → R̄ are proper functions, and Λ : M → N is a general differentiable map such that Λ(C) ⊂ D. Furthermore, we assume that F : C → R̄ is geodesically convex and that the function g_n, obtained from G in the same way as f_m is obtained from F in ( . ), is proper, convex and lsc on T_nN for some n ∈ D. One model that fits these requirements is the dextrous hand grasping problem from Dirr, Helmke, Lageman, Sect. . There M = N = P_+(n) is the Hadamard manifold of symmetric positive definite matrices, F(p) = trace(wp) holds with some w ∈ M, and G(p) = − log det(p), cf. Example . . Another model verifying the assumptions will be presented in the section on ROF models.
Our algorithm requires a choice of a pair of base points m ∈ C and n ∈ D. The role of m is to serve as a possible linearization point for Λ, while n is the base point of the Fenchel conjugate for G. More generally, the points can be allowed to change during the iterations. We emphasize this possibility by writing m^(k) and n^(k) when appropriate.
Under the standing assumptions, the following saddle-point formulation is equivalent to ( . ):
$$ \text{Minimize} \quad \sup_{\xi_n \in T^*_nN} \; \langle \xi_n, \log_n \Lambda(p) \rangle + F(p) - G_n^*(\xi_n), \quad p \in C. $$
The proof of equivalence uses Theorem . applied to G and the details are left to the reader.
From now on, we will consider problem ( . ), whose solution by primal-dual optimization algorithms is challenging due to the lack of a vector space structure, which implies in particular the absence of a concept of linearity of Λ. This is also the reason why we cannot derive a dual problem associated with ( . ) following the same reasoning as in vector spaces. Therefore we concentrate on the saddle-point problem ( . ). Following along the lines of Valkonen, Sect. , where a system of optimality conditions for the Hilbert space counterpart of the saddle-point problem ( . ) is stated, we conjecture that if (p̂, ξ̂_n) ∈ C × T*_nN solves ( . ), then it satisfies the system ( . ). Motivated by Valkonen, Sect. , we propose to replace p̂ by m, the point where we linearize the operator Λ, which suggests to consider the system for the unknowns (p, ξ_n).

Algorithm . Exact (primal relaxed) Riemannian Chambolle-Pock for ( . ).

Remark . . In the specific case that X = M and Y = N are Hilbert spaces, F : X → R is continuously differentiable, Λ : X → Y is a linear operator, m = n = 0, and either DΛ(m)* has empty null space or dom G = Y, we observe (similar to Valkonen) that the conditions ( . ) simplify to where p ∈ X and ξ ∈ T*_nN = Y*.

Exact Riemannian Chambolle-Pock
In this subsection we develop the exact Riemannian Chambolle-Pock algorithm summarized in Algorithm .
The name "exact", introduced by Valkonen, refers to the fact that the operator Λ in the dual step is used in its exact form and only the primal step employs a linearization in order to obtain the adjoint DΛ(m)*. Indeed, our Algorithm can be interpreted as a generalization of Valkonen, Alg. . Let us motivate the formulation of Algorithm . We start from the second inclusion in ( . ) and obtain, for any τ > 0, the equivalent condition Similarly we obtain that the first inclusion in ( . ) is equivalent to for any σ > 0. Lemma . now suggests the following alternating algorithmic scheme: where Through θ we perform an over-relaxation of the primal variable. This basic form of the algorithm can be combined with an acceleration by step size selection as described in Chambolle, Pock, Sec. . This yields Algorithm .

Linearized Riemannian Chambolle-Pock
The main obstacle in deriving a complete duality theory for problem ( . ) is the lack of a concept of linearity of operators Λ between manifolds. In the previous section, we chose to linearize Λ in the primal update step only, in order to have an adjoint. By contrast, we now replace Λ by its first order approximation
$$ \Lambda(p) \approx \exp_{\Lambda(m)} D\Lambda(m)[\log_m p] $$
everywhere throughout this section. Here DΛ(m) : T_mM → T_{Λ(m)}N denotes the derivative (pushforward) of Λ at m. Since DΛ : TM → TN is a linear operator between tangent bundles, we can utilize the adjoint operator DΛ(m)* : T*_{Λ(m)}N → T*_mM. We further point out that we can work algorithmically with cotangent vectors ξ_n ∈ T*_nN with a fixed base point n since, at least locally, we can obtain a cotangent vector ξ_{Λ(m)} ∈ T*_{Λ(m)}N from it by parallel transport using ξ_{Λ(m)} = P_{Λ(m)←n} ξ_n. The duality pairing reads as follows:
$$ \bigl\langle D\Lambda(m)[\log_m p],\, P_{\Lambda(m) \leftarrow n}\, \xi_n \bigr\rangle = \bigl\langle \log_m p,\, D\Lambda(m)^*[P_{\Lambda(m) \leftarrow n}\, \xi_n] \bigr\rangle $$
for every p ∈ C and ξ_n ∈ T*_nN.
We substitute the approximation ( . ) into ( . ), which yields the linearized primal problem
$$ \text{Minimize} \quad F(p) + G\bigl( \exp_{\Lambda(m)} D\Lambda(m)[\log_m p] \bigr), \quad p \in C. $$
For simplicity, we assume Λ(m) = n for the remainder of this subsection. Hence, the analogue of the saddle-point problem ( . ) reads as follows:
$$ \text{Minimize} \quad \sup_{\xi_n \in T^*_nN} \; \bigl\langle \xi_n, D\Lambda(m)[\log_m p] \bigr\rangle + F(p) - G_n^*(\xi_n), \quad p \in C. $$
We refer to it as the linearized saddle-point problem. As for ( . ) and ( . ), problems ( . ) and ( . ) are equivalent by Theorem . . In addition, in contrast to ( . ), we are now able to also derive a Fenchel dual problem associated with ( . ).
Notice that the analogue of ( . ) is ( . ). In the situation described in Remark . , ( . ) agrees with ( . ). Motivated by the statement of the linearized primal-dual pair ( . ), ( . ) and the saddle-point system ( . ), a further development of duality theory and an investigation of the linearization error are left for future research.
Both the exact and the linearized variants of our Riemannian Chambolle-Pock algorithm (RCPA) can be stated in two variants, which over-relax either the primal variable as in Algorithm , or the dual variable as in Algorithm . In total this yields four possibilities: exact vs. linearized, and primal vs. dual over-relaxation. This generalizes the analogous cases discussed in Valkonen for the Hilbert space setting. In each of the four cases, it is possible to allow changes in the base points, and moreover, n^(k) may be equal to or different from Λ(m^(k)). Letting m^(k) depend on k changes the linearization point of the operator, while allowing n^(k) to change introduces different n^(k)-Fenchel conjugates G*_{n^(k)}, and it also incurs a parallel transport on the dual variable. These possibilities are reflected in the statement of Algorithm .
Reasonable choices for the base points include, e.g., to set both m^(k) = m and n^(k) = Λ(m), for k ≥ 0 and some m ∈ M. This choice eliminates the parallel transport in the dual update step as well as the innermost parallel transport of the primal update step. Another choice is to fix just n and set m^(k) = p^(k), which eliminates the parallel transport in the primal update step. It further eliminates both parallel transports of the dual variable in the corresponding steps of Algorithm .

Chambolle-Pock Algorithm in Hilbert Spaces
In this subsection we confirm that both Algorithm and Algorithm boil down to the classical Chambolle-Pock method in Hilbert spaces; see Chambolle, Pock, Alg. . To this end, suppose in this subsection that M = X and N = Y are finite-dimensional Hilbert spaces with inner products (· , ·)_X and (· , ·)_Y, respectively, and that Λ : X → Y is a linear operator. In Hilbert spaces, geodesics are straight lines in the usual sense. Moreover, X and Y can be identified with their tangent spaces at arbitrary points, the exponential map equals addition, and the logarithmic map equals subtraction. In addition, all parallel transports are identity maps.
We now show that Algorithm reduces to the classical Chambolle-Pock method when n = 0 ∈ Y is chosen. The same then holds true for Algorithm as well since Λ is already linear. Notice that the iterates p^(k) belong to X while the iterates ξ^(k) belong to Y*. We can drop the fixed base point n = 0 from their notation. Also notice that G*_0 agrees with the classical Fenchel conjugate and it will be denoted by G* : Y* → R̄.
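We only need to consider the dual update, the primal update and the extrapolation step in Algorithm . The dual update step becomes
$$ \xi^{(k+1)} \leftarrow \operatorname{prox}_{\tau_k G^*}\bigl( \xi^{(k)} + \tau_k\, (\Lambda \bar p^{(k)})^\flat \bigr). $$
Here ♭ : Y → Y* denotes the Riesz isomorphism for the space Y. Next we address the primal update step, which reads
$$ p^{(k+1)} \leftarrow \operatorname{prox}_{\sigma_k F}\bigl( p^{(k)} - \sigma_k\, (\Lambda^* \xi^{(k+1)})^\sharp \bigr). $$
Here ♯ : X* → X denotes the inverse Riesz isomorphism for the space X. Finally, the (primal) extrapolation step becomes
$$ \bar p^{(k+1)} \leftarrow p^{(k+1)} - \theta_k \bigl( p^{(k)} - p^{(k+1)} \bigr) = p^{(k+1)} + \theta_k \bigl( p^{(k+1)} - p^{(k)} \bigr). $$
The steps above agree with Chambolle, Pock, Alg. (with the roles of σ and τ reversed).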

Convergence of the Linearized Chambolle-Pock Algorithm
In the following we adapt the proof of Chambolle, Pock to solve the linearized saddle-point problem ( . ). We restrict the discussion to the case where M and N are Hadamard manifolds and C = M and D = N. Recall that in this case we have L_{N,n} = T_nN, so g_n = G ∘ exp_n holds everywhere on T_nN. Moreover, we fix m ∈ M and n := Λ(m) ∈ N during the iteration, set the acceleration parameter γ to zero and choose the over-relaxation parameter θ_k ≡ 1 in Algorithm .
Before presenting the main result of this section, and motivated by the condition introduced after Valkonen, eq. ( . ), we introduce the following constant
$$ L := \| D\Lambda(m) \|, $$
i.e., the operator norm of DΛ(m) : T_mM → T_nN.
Theorem . . Suppose that M and N are two Hadamard manifolds. Let F : M → R̄, G : N → R̄ be proper and lsc functions, and let Λ : M → N be differentiable. Fix m ∈ M and n := Λ(m) ∈ N. Assume that F is geodesically convex and that g_n = G ∘ exp_n is convex on T_nN. Suppose that the linearized saddle-point problem ( . ) has a saddle-point (p̂, ξ̂_n). Choose σ, τ such that στL² < 1, with L defined in ( . ), and let the iterates (ξ_n^(k), p^(k), ξ̄_n^(k)) be given by Algorithm . Suppose that there exists K ∈ N such that for all k ≥ K, the condition ( . ) on C(k) holds. Then the following statements are true.
Remark . . A main difference of Theorem . to the Hilbert space case is the condition on C(k).
Restricting this theorem to the Hilbert space setting, the parallel transport and the logarithmic map simplify to the identity and subtraction, respectively. Hence C(k) simplifies for any ξ̄_n^(k), so condition ( . ) is satisfied for all k ∈ N.
We now make the choice  =  and notice that the sum of ( .a), ( .b) and ( .e) corresponds to  ().We also notice that the rst two lines on the right hand side of ( . ) are the primal-dual gap, denoted in the following by PDG().Moreover, we set   =   .With these substitutions in ( .a)-( .e), we arrive at the estimate We continue to sum ( . ) from to  − , where we set  ( . ) Since ,   is a saddle-point, the primal-dual gap PDG() is non-negative.Moreover, assumption ( . ) and the inequality  < imply that the sequence  () ,  ()    is bounded, which is the statement ().

ROF Models on Manifolds
A starting point of the work of Chambolle, Pock is the ROF ℓ²-TV denoising model of Rudin, Osher, Fatemi, which was generalized to manifolds in Lellmann et al. for the so-called isotropic and anisotropic cases. This class of ℓ²-TV models can be formulated in the discrete setting as follows: let f = (f_{i,j})_{i,j} ∈ M^{d_1×d_2}, d_1, d_2 ∈ N, be a manifold-valued image, i.e., each pixel f_{i,j} takes values on a manifold M. Then the manifold-valued ℓ²-TV energy functional ( . ) consists of a data fidelity term built from the squared distances d²_M(p_{i,j}, f_{i,j}), summed over all pixels, and the total variation term ‖∇p‖_{g,q,1}, for p = (p_{i,j})_{i,j} ∈ M^{d_1×d_2}, where q ∈ {1, 2}. The parameter α > 0 balances the relative influence of the data fidelity and the total variation terms in ( . ). Moreover, ∇ : M^{d_1×d_2} → TM^{d_1×d_2×2} denotes the generalization of the one-sided finite difference operator. For simplicity of notation we do not explicitly state the base point in the Riemannian metric but denote the norm on TM by ‖·‖_g. Depending on the value of q ∈ {1, 2}, we call the energy functional ( . ) isotropic when q = 2 and anisotropic for q = 1. Note that previous algorithms like CPPA from Weinmann, Demaret, Storath or Douglas-Rachford (DR) from Bergmann, Persch, Steidl are only able to tackle the anisotropic case q = 1 due to a missing closed form of the proximal map for the isotropic TV summands. A relaxed version of the isotropic case can be computed using the half-quadratic minimization from Bergmann, Chan, et al. Looking at the optimality conditions of the isotropic or anisotropic energy functional, the authors in Bergmann, Tenbrinck derived and solved the corresponding q-Laplace equation. This can be generalized even to all cases q > 0.
The minimization of ( . ) fits into the setting of the model problem ( . ). Indeed, $\mathcal M$ is replaced by $\mathcal M^{d_1 \times d_2}$, $\mathcal N = T\mathcal M^{d_1 \times d_2 \times 2}$, $F$ is given by the first term in ( . ), and we set $\Lambda = \nabla$ and $G = \|\cdot\|_{g,q,1}$. The data fidelity term $F$ clearly fulfills the assumptions stated in the beginning of Section , since the squared Riemannian distance function is geodesically convex on any strongly convex set $\mathcal C \subset \mathcal M$. In particular, when $\mathcal M$ is a Hadamard manifold, then $F$ is geodesically convex on all of $\mathcal M$.
While the properness and continuity of the pullback $g_n(Y) = G(\exp_n Y)$ are obvious, its convexity is investigated in the following.
Proposition . . Suppose that $\mathcal M$ is a Hadamard manifold and $d_1, d_2 \in \mathbb N$. Consider $\mathcal M^{d_1 \times d_2}$ and $\mathcal N = T\mathcal M^{d_1 \times d_2 \times 2}$ and $G = \|\cdot\|_{g,q,1}$ with $q \in [1, \infty)$. For arbitrary $n \in \mathcal N$, define the pullback $g_n \colon T_n\mathcal N \to \mathbb R$ by $g_n(Y) = G(\exp_n Y)$. Then $g_n$ is a convex function on $T_n\mathcal N$.
Proof. Notice first that, since $\mathcal M$ is Hadamard, $\mathcal M^{d_1 \times d_2}$ and $\mathcal N$ are Hadamard as well. Consequently, $g_n$ is defined on all of $T_n\mathcal N$. We are using the index $\cdot_p$ to denote points in $\mathcal M^{d_1 \times d_2}$ and the index $\cdot_X$ to denote tangent vectors. In particular, we denote the base point as $n = (n_p, n_X) \in \mathcal N$. Let $Y = (Y_p, Y_X)$, $Z = (Z_p, Z_X) \in T_n\mathcal N$ and $t \in [0, 1]$. Finally, we set $r = (r_p, r_X) = \exp_n((1 - t) Y + t Z)$. Notice that in view of the properties of the double tangent bundle as a Riemannian manifold, we have Therefore we obtain

We apply Algorithm to solve the linearized saddle-point problem ( . ). This procedure will yield an approximate minimizer of ( . ). To this end we require both the Fenchel conjugate and the proximal map of $G$. Its Fenchel dual can be stated using the dual norms.
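In the vector-space analogue, the conjugate of a norm is the indicator function of the unit ball of its dual norm, so the proximal map of the conjugate reduces to a projection onto that ball. The sketch below illustrates this for the two cases q = 1 and q = 2. It assumes that the tangent vectors at each pixel are given as coordinate tuples with respect to an orthonormal basis, so that the Riemannian norm is the Euclidean norm of the coordinates. The function name `project_dual_ball` is hypothetical, and the code is not taken from Manopt.jl.

```python
import math


def project_dual_ball(xi, q):
    """Project a dual field onto the unit ball of the norm dual to ||.||_{q,1}.

    xi is a list of pixels; each pixel is a pair (xi_1, xi_2) of coordinate
    tuples, one per finite-difference direction.
    q = 1 (anisotropic): each direction is scaled into the unit ball on its own.
    q = 2 (isotropic): both directions of a pixel are scaled jointly.
    """
    if q not in (1, 2):
        raise ValueError("only q = 1 and q = 2 are covered by this sketch")
    projected = []
    for xi_1, xi_2 in xi:
        n1 = math.sqrt(sum(c * c for c in xi_1))
        n2 = math.sqrt(sum(c * c for c in xi_2))
        if q == 1:
            s1, s2 = max(1.0, n1), max(1.0, n2)
        else:
            s1 = s2 = max(1.0, math.hypot(n1, n2))
        projected.append(
            (tuple(c / s1 for c in xi_1), tuple(c / s2 for c in xi_2))
        )
    return projected
```

Entries that already lie inside the respective unit ball are returned unchanged, which is the defining property of a projection.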

Numerical Experiments
The numerical experiments are implemented in the toolbox Manopt.jl (Bergmann, ) in Julia. They were run on a MacBook Pro, . GHz Intel Core i , GB RAM, with Julia . . All our examples are based on the linearized saddle-point formulation ( . ) for $\ell^2$-TV, solved with Algorithm .
A Signal with Known Minimizer
The first example uses signal data $\mathcal M^{d_1}$ instead of an image, where the data space is $\mathcal M = \mathbb S^2$, the two-dimensional sphere with the round sphere Riemannian metric. This gives us the opportunity to consider the same problem also on the embedding manifold $(\mathbb R^3)^{d_1}$ in order to illustrate the difference between the manifold-valued and Euclidean settings. We construct the data $(f_i)_i$ such that the unique minimizer of ( . ) is known in closed form. Therefore a second purpose of this problem is to compare the numerical solution obtained by Algorithm , i.e., an approximate saddle-point of the linearized problem ( . ), to the solution of the original saddle-point problem ( . ). Third, we wish to explore how the value $C(k)$ from ( . ) behaves numerically.
Manopt.jl is available at http://www.manoptjl.org, following the same philosophy as the Matlab version available at https://manopt.org, see also Boumal et al., . Julia is available at https://julialang.org.

The piecewise constant signal takes two values $a, b \in \mathcal M$ specified below. We applied the linearized Riemannian Chambolle-Pock Algorithm with relaxation parameter $\theta = 1$ on the dual variable as well as $\sigma = \tau$ = , and $\gamma = 0$, i.e., without acceleration, as well as initial guesses $p^{(0)} = f$ and $\xi_n^{(0)}$ as the zero vector. The stopping criterion was set to iterations to compare run times on different manifolds. As linearization point $m$ we use the mean of the data, which is just $m = \gamma_{a,b}(\tfrac{1}{2})$. We further set $n = \Lambda(m)$ for the base point of the Fenchel dual of $G$. For the Euclidean case $\mathcal M = \mathbb R^3$, we obtain a shifted version of the original Chambolle-Pock algorithm, since $m \neq 0$.
While the algorithm on $\mathcal M = \mathbb S^2$ takes about . seconds, the Euclidean algorithm takes about . seconds for the same number of iterations. The difference is most likely due to the exponential and logarithmic maps as well as the parallel transport on $\mathbb S^2$, which involve sines and cosines. The result obtained by the Euclidean algorithm is . • − away in terms of the Euclidean norm from the analytical minimizer $\hat p_{\mathbb R}$. Notice that the convergence of the Euclidean algorithm is covered by the theory in Chambolle, Pock, . Moreover, notice that in this setting, $\Lambda$ is a linear map between vector spaces.
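The Euclidean computation can be mimicked in a few lines. The following Python sketch runs a Chambolle-Pock iteration with relaxation on the dual variable and without acceleration for a piecewise constant signal in $\mathbb R^3$, and compares the result with the closed-form two-valued minimizer. It is a sketch under assumptions: the data term is scaled by 1/(2α), the unshifted vector-space form of the algorithm is used (base point zero), and the step sizes σ = τ = 0.4, the signal length 10, α = 2 and the iteration count are chosen purely for illustration. None of these values are taken from the experiment, and all function names are hypothetical.

```python
import math


def chambolle_pock_tv_signal(f, alpha, sigma=0.4, tau=0.4, theta=1.0,
                             iterations=5000):
    """Minimize 1/(2*alpha) * sum |p_i - f_i|^2 + sum |p_{i+1} - p_i|."""
    n, dim = len(f), len(f[0])
    p = [list(v) for v in f]                  # primal iterate, starts at the data
    xi = [[0.0] * dim for _ in range(n - 1)]  # dual iterate, starts at zero
    xi_bar = [list(v) for v in xi]            # relaxed dual variable
    for _ in range(iterations):
        # Primal step: proximal map of tau*F at p - tau * K^T xi_bar, where
        # (K p)_i = p_{i+1} - p_i and (K^T xi)_i = xi_{i-1} - xi_i.
        for i in range(n):
            for c in range(dim):
                left = xi_bar[i - 1][c] if i > 0 else 0.0
                right = xi_bar[i][c] if i < n - 1 else 0.0
                v = p[i][c] - tau * (left - right)
                p[i][c] = (v + tau / alpha * f[i][c]) / (1.0 + tau / alpha)
        # Dual step: ascent, then projection onto the pointwise unit balls.
        xi_old = [list(v) for v in xi]
        for i in range(n - 1):
            w = [xi[i][c] + sigma * (p[i + 1][c] - p[i][c]) for c in range(dim)]
            scale = max(1.0, math.sqrt(sum(x * x for x in w)))
            xi[i] = [x / scale for x in w]
        # Relaxation on the dual variable.
        xi_bar = [
            [xi[i][c] + theta * (xi[i][c] - xi_old[i][c]) for c in range(dim)]
            for i in range(n - 1)
        ]
    return p


def two_value_minimizer(a, b, alpha, ell):
    """Closed-form values for data with ell copies of a followed by ell of b."""
    dist = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    delta = min(alpha / (ell * dist), 0.5)
    first = [x + delta * (y - x) for x, y in zip(a, b)]
    second = [y + delta * (x - y) for x, y in zip(a, b)]
    return first, second


r = 1.0 / math.sqrt(2.0)
a, b = (r, r, 0.0), (r, -r, 0.0)
f = [a] * 5 + [b] * 5
p = chambolle_pock_tv_signal(f, alpha=2.0)
first, second = two_value_minimizer(a, b, alpha=2.0, ell=5)
error = max(
    abs(p[i][c] - (first if i < 5 else second)[c])
    for i in range(10) for c in range(3)
)
print(error)
```

The step sizes satisfy στ‖K‖² < 1, since the squared norm of the one-dimensional forward difference operator is bounded by 4. The printed value is the largest componentwise deviation of the iterate from the closed-form minimizer.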
by Bergmann, Persch, Steidl, for a similar example. It took . seconds to perform iterations in order to reach the same value of the cost function as obtained by CPPA. The main bottleneck is the approximate evaluation of the involved mean, which has to be computed in every iteration. Here we performed gradient descent steps for this purpose.
For Algorithm we set $\sigma = \tau$ = . and $\gamma$ = . . We choose the base point $m \in \mathcal P_+(\ )^{\ \times\ }$ to be the constant image of unit matrices so that $n = \Lambda(m)$ consists of zero matrices. We initialize the algorithm with $p^{(0)} = f$ and $\xi_n^{(0)}$ as the zero vector. Our algorithm stops after iterations, which take . seconds, when the value of ( . ) was below the value obtained by the CPPA. While the CPPA requires about half a second per iteration, our method requires a little less than a second per iteration, but it also requires only a fraction of the iteration count of CPPA. The behavior of the cost function is shown in Fig. . c, where the horizontal axis (iteration number) is shown in log scale, since the "tail" of CPPA is quite long. The resulting values of the cost function ( . ) differ, but both show a similar convergence behavior.

Conclusions
This paper introduces a novel concept of Fenchel duality for manifolds. We investigate properties of this novel duality concept and study corresponding primal-dual formulations of non-smooth optimization problems on manifolds. This leads to a novel primal-dual algorithm on manifolds, which comes in two variants, termed the exact and linearized Riemannian Chambolle-Pock algorithm. The convergence proof for the linearized version is given on arbitrary Hadamard manifolds under a suitable assumption.
It is an open question whether condition ( . ) can be removed. The convergence analysis accompanies an earlier proof of convergence for a comparable method, namely the Douglas-Rachford algorithm, where the proof is restricted to Hadamard manifolds of constant curvature. Numerical results illustrate not only that the linearized Riemannian Chambolle-Pock algorithm performs as well as state-of-the-art methods on Hadamard manifolds, but also that it performs similarly well on manifolds with positive sectional curvature. Note that here it also has to deal with the absence of a global convexity concept of the functional.
A more thorough investigation as well as a convergence proof for the exact variant are topics for future research. Another point of future research is an investigation of the influence of the choice of the base points $m \in \mathcal M$ and $n \in \mathcal N$ on the convergence, especially when the base points vary during the iterations.
Starting from the proper statement of the primal and dual problem for the linearization approach of Section . , further aspects are open to investigation, for instance, regularity conditions ensuring strong duality. Well-known closedness-type conditions are then available, opening in this way a new line of rich research topics for optimization on manifolds.
Another point of potential future research is the measurement of the linearization error introduced by the model from Section . . The analysis of the discrepancy term, as well as its behavior in the convergence of the linearized algorithm, Algorithm , are closely related to the choice of the base points during the iteration, and should be considered in future research.
Furthermore, our novel concept of duality permits a definition of infimal convolution and thus offers a direct possibility to introduce the total generalized variation. In what way these novel priors correspond to existing ones is another issue of ongoing research. Finally, both a convergence rate and properties on manifolds with non-negative curvature remain open.

Figure .: Illustration of the Fenchel conjugate for the case  = as an interpretation by the tangents of slope  * .
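Definition . . Let $\mathcal C \subset \mathcal M$ and $p \in \mathcal C$. We introduce the tangent subset $\mathcal L_{\mathcal C,p} \subset T_p\mathcal M$ as
$$\mathcal L_{\mathcal C,p} := \bigl\{ X \in T_p\mathcal M \,\big|\, \exp_p X \in \mathcal C \text{ and } \|X\|_p = d_{\mathcal M}(\exp_p X, p) \bigr\},$$
a localized variant of the pre-image of the exponential map.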

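Further, since $d_2 = 1$, the isotropic and anisotropic models ( . ) coincide. The exact minimizer $\hat p$ of ( . ) is piecewise constant with the same structure as the data $f$. Its values are $\hat p_1 = \gamma_{a,b}(\delta)$ and $\hat p_2 = \gamma_{b,a}(\delta)$, where $\delta = \min\bigl\{ \frac{\alpha}{\ell\, d_{\mathcal M}(a,b)}, \frac{1}{2} \bigr\}$ and $\ell$ is the number of samples in each constant piece. Notice that the notions of geodesics are different for the two manifolds under consideration, and thus the exact minimizers $\hat p_{\mathbb R}$ and $\hat p_{\mathbb S}$ are different. In the following we use $\alpha$ = and $a = \frac{1}{\sqrt 2}(1, 1, 0)^{\mathsf T}$ and $b = \frac{1}{\sqrt 2}(1, -1, 0)^{\mathsf T}$. The data $f$ is shown in Fig. .a.
(a) Signal $f$ of unit vectors. (b) Minimizer with values in $\mathcal M = \mathbb S^2$. (c) Minimizer with values in $\mathcal M = \mathbb R^3$. (d) Signal of $\mathcal P_+(\ )$ matrices. (e) Minimizer on $\mathcal M = \mathcal P_+(\ )$.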
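The closed-form minimizer can be evaluated directly. The sketch below does so on the sphere and in the embedding space, assuming the data term is scaled by 1/(2α), so that each value moves along the geodesic by the fraction δ = min{α / (ℓ d(a, b)), 1/2}, where ℓ is the number of samples per constant piece. The values α = 0.5 and ℓ = 5 are hypothetical and chosen only to keep δ below 1/2; all helper names are illustrative.

```python
import math


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def sphere_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return math.acos(max(-1.0, min(1.0, dot)))


def line_segment(a, b, t):
    """Euclidean geodesic from a (t = 0) to b (t = 1)."""
    return tuple((1.0 - t) * x + t * y for x, y in zip(a, b))


def sphere_geodesic(a, b, t):
    """Shortest geodesic on the unit sphere from a (t = 0) to b (t = 1).

    Assumes a and b are unit vectors that are not antipodal.
    """
    omega = sphere_distance(a, b)
    if omega < 1e-12:
        return tuple(a)
    s = math.sin(omega)
    return tuple(
        (math.sin((1.0 - t) * omega) * x + math.sin(t * omega) * y) / s
        for x, y in zip(a, b)
    )


def known_minimizer(a, b, alpha, ell, geodesic, distance):
    """Two values of the piecewise constant minimizer (assumed scaling)."""
    delta = min(alpha / (ell * distance(a, b)), 0.5)
    return geodesic(a, b, delta), geodesic(b, a, delta)


r = 1.0 / math.sqrt(2.0)
a, b = (r, r, 0.0), (r, -r, 0.0)
on_sphere = known_minimizer(a, b, 0.5, 5, sphere_geodesic, sphere_distance)
in_space = known_minimizer(a, b, 0.5, 5, line_segment, euclidean_distance)
norm_sphere = math.sqrt(sum(c * c for c in on_sphere[0]))
norm_space = math.sqrt(sum(c * c for c in in_space[0]))
print(norm_sphere, norm_space)
```

The first norm stays at one up to rounding, while the second drops below one, because the Euclidean geodesic cuts through the interior of the ball. This is the different form of contrast loss in the two settings.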

Figure .: Computing the minimizer of the manifold-valued $\ell^2$-TV model for a signal of unit vectors shown in (a) with respect to both manifolds $\mathbb R^3$ and $\mathbb S^2$ with $\alpha$ = : (b) on $\mathbb S^2$ and (c) on $\mathbb R^3$. The known effect, loss of contrast, is different for both cases, since on $\mathbb S^2$ the vectors remain of unit length. The same effect can be seen for a signal of spd matrices, i.e., $\mathcal P_+(\ )$; see (d) and (e).

Figure .: Development of the three algorithms, the cyclic proximal point algorithm (CPPA), the parallel Douglas-Rachford algorithm (PDRA) and the linearized Riemannian Chambolle-Pock algorithm (lRCPA), all starting from the original data in (a) and reaching the final value (image) in (b), is shown in (c), where the iterations on the x-axis are in log scale.
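The definition of the Fenchel conjugate of $F$ is motivated by Rockafellar, , Thm. . .
Definition . . Suppose that $F \colon \mathcal C \to \overline{\mathbb R}$, where $\mathcal C \subset \mathcal M$ is strongly convex, and $m \in \mathcal C$. The $m$-Fenchel conjugate of $F$ is defined as the function $F_m^* \colon T_m^*\mathcal M \to \overline{\mathbb R}$ such that
$$F_m^*(\xi_m) := \sup_{X \in \mathcal L_{\mathcal C,m}} \bigl\{ \langle \xi_m, X \rangle - F(\exp_m X) \bigr\}.$$

Since $F_m^*$ is proper we can pick some $\xi_m \in \operatorname{dom} F_m^*$. Hence, applying Definition . we get
$$F_m^*(\xi_m) = \sup_{X \in \mathcal L_{\mathcal C,m}} \bigl\{ \langle \xi_m, X \rangle - F(\exp_m X) \bigr\} < +\infty,$$
so there must exist at least one $X \in \mathcal L_{\mathcal C,m}$ such that $F(\exp_m X) \in \mathbb R$. This shows that $F \not\equiv +\infty$. On the other hand, let $p \in \mathcal C$ and take $X := \log_m p$. If $F(p)$ were equal to $-\infty$, then $F_m^*(\xi_m) = +\infty$ for any $\xi_m \in T_m^*\mathcal M$, which would contradict the properness of $F_m^*$. Consequently, $F$ is proper.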
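As a consistency check, consider the vector-space case, stated here as a worked example under the assumptions $\mathcal M = \mathcal C = \mathbb R^d$, $\exp_m X = m + X$ and $\mathcal L_{\mathcal C,m} = T_m\mathcal M \cong \mathbb R^d$. Writing $F^*$ for the classical Fenchel conjugate and substituting $Y = m + X$ in the definition of the $m$-Fenchel conjugate gives

```latex
\begin{aligned}
F_m^*(\xi_m)
  &= \sup_{X \in \mathbb{R}^d} \bigl\{ \langle \xi_m, X \rangle - F(m + X) \bigr\} \\
  &= \sup_{Y \in \mathbb{R}^d} \bigl\{ \langle \xi_m, Y - m \rangle - F(Y) \bigr\} \\
  &= F^*(\xi_m) - \langle \xi_m, m \rangle .
\end{aligned}
```

Under these assumptions the $m$-Fenchel conjugate differs from the classical conjugate only by a linear shift, and the two coincide for $m = 0$.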