Extremal Points and Sparse Optimization for Generalized Kantorovich–Rubinstein Norms

Carioni, Marcello; Iglesias, José A.; Walter, Daniel

doi:10.1007/s10208-023-09634-7

Published: 11 December 2023

Foundations of Computational Mathematics (2023)

Marcello Carioni¹,
José A. Iglesias¹ &
Daniel Walter²


Abstract

A precise characterization of the extremal points of sublevel sets of nonsmooth penalties provides both detailed information about minimizers, and optimality conditions in general classes of minimization problems involving them. Moreover, it enables the application of fully corrective generalized conditional gradient methods for their efficient solution. In this manuscript, this program is adapted to the minimization of a smooth convex fidelity term which is augmented with an unbalanced transport regularization term given in the form of a generalized Kantorovich–Rubinstein norm for Radon measures. More precisely, we show that the extremal points associated to the latter are given by all Dirac delta functionals supported in the spatial domain as well as certain dipoles, i.e., pairs of Diracs with the same mass but with different signs. Subsequently, this characterization is used to derive precise first-order optimality conditions as well as an efficient solution algorithm for which linear convergence is proved under natural assumptions. This behavior is also reflected in numerical examples for a model problem.

References

[1] F. Angrisani, G. Ascione, L. D’Onofrio, and G. Manzo, Duality and distance formulas in Lipschitz-Hölder spaces, Atti Accad. Naz. Lincei Rend. Lincei Mat. Appl. 31 (2020), no. 2, 401–419.
[2] F. Angrisani, G. Ascione, and G. Manzo, Atomic decomposition of finite signed measures on compacts of $\mathbb{R}^n$, Ann. Fenn. Math. 46 (2021), no. 2, 643–654.
[3] N. Boyd, G. Schiebinger, and B. Recht, The alternating descent conditional gradient method for sparse inverse problems, SIAM J. Optim. 27 (2017), no. 2, 616–639.
[4] C. Boyer, A. Chambolle, Y. De Castro, V. Duval, F. De Gournay, and P. Weiss, On representer theorems and convex regularization, SIAM J. Optim. 29 (2019), no. 2, 1260–1281.
[5] K. Bredies and M. Carioni, Sparsity of solutions for variational inverse problems with finite-dimensional data, Calc. Var. Partial Differential Equations 59 (2020), no. 1, 1–26.
[6] K. Bredies, M. Carioni, S. Fanzon, and F. Romero, On the extremal points of the ball of the Benamou–Brenier energy, Bull. Lond. Math. Soc. 53 (2021), no. 5, 1436–1452.
[7] K. Bredies, M. Carioni, S. Fanzon, and F. Romero, A generalized conditional gradient method for dynamic inverse problems with optimal transport regularization, Found. Comput. Math. 23 (2023), no. 3, 833–898.
[8] K. Bredies, M. Carioni, S. Fanzon, and D. Walter, Asymptotic linear convergence of fully-corrective generalized conditional gradient methods, Math. Program. (2023), https://doi.org/10.1007/s10107-023-01975-z.
[9] K. Bredies and H. K. Pikkarainen, Inverse problems in spaces of measures, ESAIM Control Optim. Calc. Var. 19 (2013), no. 1, 190–218.
[10] H. Brezis, Functional analysis, Sobolev spaces and partial differential equations, Universitext, New York, NY: Springer, 2011.
[11] E. J. Candès and C. Fernandez-Granda, Towards a mathematical theory of super-resolution, Comm. Pure Appl. Math. 67 (2014), no. 6, 906–956.
[12] E. Casas, C. Clason, and K. Kunisch, Parabolic control problems in measure spaces with sparse solutions, SIAM J. Control Optim. 51 (2013), no. 1, 28–63.
[13] J. C. Dunn, Convergence rates for conditional gradient sequences generated by implicit step length rules, SIAM J. Control Optim. 18 (1980), no. 5, 473–487.
[14] J. C. Dunn and S. Harshbarger, Conditional gradient algorithms with open loop step size rules, J. Math. Anal. Appl. 62 (1978), no. 2, 432–444.
[15] V. Duval and G. Peyré, Exact support recovery for sparse spikes deconvolution, Found. Comput. Math. 15 (2015), no. 5, 1315–1355.
[16] V. Duval and R. Tovey, Dynamical programming for off-the-grid dynamic inverse problems, Preprint arXiv:2112.11378 [math.OC], 2021.
[17] I. Ekeland and R. Témam, Convex analysis and variational problems, Classics in Applied Mathematics, vol. 28, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 1999.
[18] M. Frank and P. Wolfe, An algorithm for quadratic programming, Naval Research Logistics Quarterly 3 (1956), no. 1-2, 95–110.
[19] L. G. Hanin, Kantorovich-Rubinstein norm and its application in the theory of Lipschitz spaces, Proc. Amer. Math. Soc. 115 (1992), no. 2, 345–352.
[20] J. A. Iglesias and D. Walter, Extremal points of total generalized variation balls in 1D: characterization and applications, J. Convex Anal. 29 (2022), no. 4, 1251–1290.
[21] L. V. Kantorovich and G. P. Akilov, Functional analysis, Second ed., Pergamon Press, Oxford-Elmsford, N.Y., 1982.
[22] P.-J. Laurent, Approximation et optimisation, Collection Enseignement des Sciences, No. 13, Hermann, Paris, 1972.
[23] J. Lellmann, D. A. Lorenz, C. Schönlieb, and T. Valkonen, Imaging with Kantorovich-Rubinstein discrepancy, SIAM J. Imaging Sci. 7 (2014), no. 4, 2833–2859.
[24] L. Métivier, R. Brossier, Q. Mérigot, E. Oudet, and J. Virieux, An optimal transport approach for seismic tomography: application to 3D full waveform inversion, Inverse Problems 32 (2016), no. 11, 115008, 36 pp.
[25] P. Pegon, F. Santambrogio, and D. Piazzoli, Full characterization of optimal transport plans for concave costs, Discrete Contin. Dyn. Syst. 35 (2015), no. 12, 6113–6132.
[26] F. Santambrogio, Optimal transport for applied mathematicians, Progress in Nonlinear Differential Equations and their Applications, vol. 87, Birkhäuser/Springer, Cham, 2015.
[27] T. Strömberg, The operation of infimal convolution, Dissertationes Math. (Rozprawy Mat.) 352 (1996), 58 pp.
[28] M. Unser, J. Fageot, and J. P. Ward, Splines are universal solutions of linear inverse problems with generalized TV regularization, SIAM Rev. 59 (2017), no. 4, 769–793.
[29] D. J. Wales and J. P. K. Doye, Global optimization by basin-hopping and the lowest energy structures of Lennard-Jones clusters containing up to 110 atoms, J. Phys. Chem. A 101 (1997), no. 28, 5111–5116.
[30] N. Weaver, Lipschitz algebras, World Scientific Publishing Co. Pte. Ltd., Hackensack, NJ, 2018.
[31] Y. Yu, X. Zhang, and D. Schuurmans, Generalized conditional gradient for sparse estimation, J. Mach. Learn. Res. 18 (2017), Paper No. 144, 46 pp.
[32] C. Zălinescu, Convex analysis in general vector spaces, World Scientific Publishing Co., Inc., River Edge, NJ, 2002.

Author information

Authors and Affiliations

Department of Applied Mathematics, University of Twente, 7500AE, Enschede, The Netherlands
Marcello Carioni & José A. Iglesias
Institut für Mathematik, Humboldt-Universität zu Berlin, 10117, Berlin, Germany
Daniel Walter

Corresponding author

Correspondence to José A. Iglesias.

Additional information

Communicated by Martin Burger.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: Proofs for Sect. 3.3.2

In this section, we collect the necessary auxiliary results for the proof of Theorem 3.15 by applying the results of [8]. For this purpose, we keep using the notation $B:=\{\mu \,|\, \Vert \mu \Vert _{{\text {KR}}_p^{\alpha ,\beta }}\le 1\}$ and further introduce $\mathcal {B}:=\overline{{\text {Ext}}(B)}^*$. Since the predual space $\mathcal {C}(\Omega )$ is separable, $\mathcal {B}$ is weak* compact and there exists a metric $d_\mathcal {B}$ which metrizes the weak* topology on $\mathcal {B}$, see [10, Theorem 3.29].

Lemma A.1

We have

$$\begin{aligned} \mathcal {B}= \{\,(\sigma /\alpha )\delta _z\;|&\;\sigma \in \{-1,+1\},~z \in \Omega \,\} \\&\cup \{\,\mathcal {D}_\beta (x,y)\;|\;(x,y) \in \Omega \times \Omega ,~0\le |x-y|^p\le 2 \alpha -\beta \,\}. \end{aligned}$$

Proof

By the characterization of ${\text {Ext}}(B)$, we first observe that

$$\begin{aligned} \mathcal {B}= {}&\overline{\left\{ \,(\sigma /\alpha )\delta _z\;|\;\sigma \in \{-1,+1\},~z \in \Omega \,\right\} }^*\\&\cup \overline{\left\{ \,\mathcal {D}_\beta (x,y)\;|\;(x,y) \in \Omega \times \Omega ,~0<|x-y|^p< 2 \alpha -\beta \,\right\} }^*. \end{aligned}$$

Now, let $\mu _k=(\sigma _k/\alpha ) \delta _{z_k}$, $\sigma _k\in \{-1,1\}$, $z_k\in \Omega $, $k \in \mathbb {N}$, denote a weak* convergent sequence with limit $\bar{\mu }$. Then, due to the compactness of $\Omega $, there exists a subsequence, denoted by the same symbol, with

$$\begin{aligned} (\sigma _k, z_k) \rightarrow (\bar{\sigma }, \bar{z}) \quad \text {for some}~(\bar{\sigma }, \bar{z}) \in \{-1,1\} \times \Omega . \end{aligned}$$

Setting $\tilde{\mu }=(\bar{\sigma }/\alpha )\delta _{\bar{z}}$, the associated sequence of measures satisfies

$$\begin{aligned} \langle q,\mu _k \rangle = (\sigma _k/\alpha ) q(z_k) \rightarrow ({\bar{\sigma }}/\alpha ) q({\bar{z}})=\langle q,{\tilde{\mu }} \rangle \quad \text {for all}~q \in \mathcal {C}(\Omega ). \end{aligned}$$

Since weak* limits are unique, $\bar{\mu }={\tilde{\mu }}$ follows.

Similarly, we see that any weak* convergent sequence $\mu _k=\mathcal {D}_\beta (x_k,y_k)$ with

$$\begin{aligned} (x_k,y_k)\in \Omega \times \Omega ,~0<|x_k-y_k|^p< 2\alpha -\beta \end{aligned}$$

necessarily satisfies $\mu _k \rightharpoonup ^* \mathcal {D}_\beta (\bar{x},\bar{y})$ for some $(\bar{x},\bar{y})\in \Omega \times \Omega $ with $0\le |\bar{x}-\bar{y}|^p\le 2\alpha -\beta $. This finishes the proof. $\square $
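
The closure in Lemma A.1 admits $|x-y|^p=0$, that is, the zero measure, because dipoles whose poles merge converge weak* to zero. The following minimal sketch illustrates this numerically. It assumes the dipole form $\mathcal {D}_\beta (x,y)=(\delta _x-\delta _y)/(\beta +|x-y|^p)$, which is the form used in the pairing computations of Lemma A.4. The test function `q`, the values of `beta` and `p`, and the helper name `dipole_pairing` are illustrative assumptions and do not come from the paper.

```python
import math

def dipole_pairing(q, x, y, beta, p):
    """Pairing <q, D_beta(x, y)> = (q(x) - q(y)) / (beta + |x - y|^p)."""
    return (q(x) - q(y)) / (beta + abs(x - y) ** p)

# Illustrative choices (assumptions, not values from the paper).
beta, p = 1.0, 2.0
q = math.cos                  # a continuous test function on Omega = [0, 1]
x_bar = y_bar = 0.3           # degenerate limit: both poles coincide

# Dipoles with x_k != y_k whose poles merge as k grows.
pairings = [dipole_pairing(q, x_bar + 10.0 ** (-k), y_bar, beta, p)
            for k in range(1, 8)]

# Pairing with the limit D_beta(x_bar, x_bar), which is the zero measure.
limit = dipole_pairing(q, x_bar, y_bar, beta, p)
```

The pairings shrink towards `limit`, which equals zero, for this particular `q`; weak* convergence requires the same for every continuous test function, which follows from continuity of $q$.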

In order to apply the abstract convergence result of [8], we have to check some structural assumptions. First, we show that, due to Assumption $(\textbf{B2})$, the linear problem

$$\begin{aligned} \max _{\mu \in \mathcal {B}} \langle \bar{q}, \mu \rangle \end{aligned}$$

admits finitely many maximizers and all of them are extremal points.

Lemma A.2

Let Assumption $(\textbf{B2})$ hold. Then, we have

$$\begin{aligned} {{\,\mathrm{arg\,max}\,}}_{\mu \in \mathcal {B}} \langle \bar{q}, \mu \rangle = \left\{ ({\text {sign}}(\bar{q}(\bar{z}_i))/\alpha )\delta _{\bar{z}_i}\right\} ^{\bar{N}_1}_{i=1} \cup \left\{ \mathcal {D}_\beta (\bar{x}_j,\bar{y}_j)\right\} ^{\bar{N}_2}_{j=1}. \end{aligned}$$

Proof

Define

$$\begin{aligned} D:=\big \{({\text {sign}}(\bar{q}(\bar{z}_i))/\alpha )\delta _{\bar{z}_i}\big \}^{\bar{N}_1}_{i=1} \cup \left\{ \mathcal {D}_\beta (\bar{x}_j,\bar{y}_j)\right\} ^{\bar{N}_2}_{j=1}. \end{aligned}$$

By assumption, D is nonempty and there holds $\langle \bar{q}, \mu \rangle =1$ for all $\mu \in D$. Moreover, since $\bar{q}$ is the unique dual variable for Problem ($\mathcal {P}$) and $ \Vert \cdot \Vert _{{\text {KR}}_p^{\alpha , \beta }}$ is positively one-homogeneous, we conclude

$$\begin{aligned} \max _{\mu \in \mathcal {B}}\langle \bar{q},\mu \rangle =1 \quad \text {and thus}~D \subset {{\,\mathrm{arg\,max}\,}}_{\mu \in \mathcal {B}} \langle \bar{q}, \mu \rangle . \end{aligned}$$

The reverse inclusion follows immediately from Assumption $(\textbf{B2})$ which gives

$$\begin{aligned} \max _{z}|\bar{q}(z)|\le \alpha ,~\max _{(x,y)}\Psi _{\bar{q}}(x,y)\le 1, \end{aligned}$$

as well as noting that

$$\begin{aligned} \langle \bar{q},\ (\sigma /\alpha ) \delta _z \rangle =1\text { for }\sigma \in \{-1,1\}\text { and }z \in \Omega&\text { implies}~|\bar{q}(z)|=\alpha ,\text { and}\\ \langle \bar{q},\ \mathcal {D}_\beta (x,y) \rangle =1 \text { with } 0\le |x-y|^p\le 2 \alpha -\beta&\text { is equivalent to }\Psi _{\bar{q}}(x,y)=1. \end{aligned}$$

$\square $
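
As a rough numerical illustration of the linear problem in Lemma A.2 (a sketch, not the authors' implementation), the following brute-force grid search compares the pairings $|q(z)|/\alpha $ of Dirac atoms with the pairings $(q(x)-q(y))/(\beta +|x-y|^p)$ of admissible dipoles on a one-dimensional domain. The function name `linear_oracle`, the grid, the parameter values and the stand-in dual variable `q_vals` are hypothetical choices made only for this example.

```python
import numpy as np

def linear_oracle(q_vals, grid, alpha, beta, p):
    """Grid search for max <q, mu> over Dirac atoms and dipoles in 1D.

    Returns (best_value, atom), where atom is ("dirac", sigma, z)
    or ("dipole", x, y).
    """
    # Dirac atoms (sigma/alpha) delta_z: best pairing is |q(z)| / alpha.
    i = int(np.argmax(np.abs(q_vals)))
    dirac_val = abs(q_vals[i]) / alpha
    dirac = ("dirac", float(np.sign(q_vals[i])), float(grid[i]))

    # Dipoles: pairing (q(x) - q(y)) / (beta + |x - y|^p),
    # restricted to 0 < |x - y|^p <= 2*alpha - beta.
    dist_p = np.abs(grid[:, None] - grid[None, :]) ** p
    psi = (q_vals[:, None] - q_vals[None, :]) / (beta + dist_p)
    admissible = (dist_p > 0) & (dist_p <= 2 * alpha - beta)
    psi = np.where(admissible, psi, -np.inf)
    jx, jy = np.unravel_index(int(np.argmax(psi)), psi.shape)
    dipole_val = psi[jx, jy]
    dipole = ("dipole", float(grid[jx]), float(grid[jy]))

    if dirac_val >= dipole_val:
        return dirac_val, dirac
    return dipole_val, dipole

# Hypothetical dual variable on Omega = [0, 1] (not taken from the paper).
alpha, beta, p = 1.0, 1.0, 1.0
grid = np.linspace(0.0, 1.0, 101)
q_vals = 0.5 * np.sin(np.pi * grid)
value, atom = linear_oracle(q_vals, grid, alpha, beta, p)
```

For this stand-in `q_vals`, every dipole pairing stays strictly below 0.5, since the numerator is at most 0.5 and the denominator exceeds 1, so the search selects the Dirac atom at the midpoint. A grid search only approximates the continuous maximization and scales quadratically in the number of grid points for the dipole part.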

For abbreviation, set

$$\begin{aligned} \bar{\mu }^1_i= ({{\,\textrm{sign}\,}}(\bar{q}(\bar{z}_i))/\alpha ) \delta _{\bar{z}_i},~\bar{\mu }^2_j= \mathcal {D}_\beta (\bar{x}_j,\bar{y}_j) \quad \text {for all}~i=1,\dots , \bar{N}_1,~j=1,\dots , \bar{N}_2. \end{aligned}$$

Second, we have to show the existence of $d_{\mathcal {B}}$-neighborhoods $U^1_i$ of $\bar{\mu }^1_i$ and $U^2_j$ of $\bar{\mu }^2_j$ in $\mathcal {B}$, respectively, as well as of a mapping $g :{\text {Ext}}(B) \times {\text {Ext}}(B) \rightarrow \mathbb {R}$ and constants $\theta ,~C_K >0$ with

$$\begin{aligned} \Vert K(\mu -\bar{\mu }^k_j)\Vert _Y \le C_K \, g(\mu , \bar{\mu }^k_j)\ \text { and }\ 1-\langle \bar{q},\mu \rangle \ge \theta \, g(\mu ,\bar{\mu }^k_j)^2 \end{aligned}$$

(40)

for all $j=1, \dots , \bar{N}_k$, $k=1,2$, and all $\mu \in U^k_j \cap {\text {Ext}}(B)$. We claim that this is satisfied for

$$\begin{aligned} g(\mu _1,\mu _2) :={\left\{ \begin{array}{ll} |z_1-z_2|+|\sigma _1-\sigma _2| &{} \text {if}~\mu _1= (\sigma _1/\alpha ) \delta _{z_1},~\mu _2= (\sigma _2/\alpha ) \delta _{z_2},~z_1,z_2 \in \Omega ,~\sigma _1,\sigma _2 \in \{-1,1\}, \\ \left| \begin{pmatrix} x_1-x_2 \\ y_1-y_2\end{pmatrix} \right| &{} \text {if}~\mu _1=\mathcal {D}_\beta (x_1,y_1),~\mu _2=\mathcal {D}_\beta (x_2,y_2),~(x_1,y_1), (x_2,y_2) \in \Omega \times \Omega , \\ 0 &{} \text {else.} \end{array}\right. } \end{aligned}$$
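
The case distinction in the definition of $g$ can be written out as a small function. This is a minimal sketch for a one-dimensional domain; the tuple encoding of atoms (`("dirac", sigma, z)` for a Dirac atom with sign `sigma` at `z`, `("dipole", x, y)` for $\mathcal {D}_\beta (x,y)$) is an assumption introduced here for illustration only.

```python
import math

def g(mu1, mu2):
    """Distance-like function g on pairs of extremal points (1D sketch).

    Atoms are encoded as ("dirac", sigma, z) or ("dipole", x, y).
    """
    if mu1[0] == "dirac" and mu2[0] == "dirac":
        (_, s1, z1), (_, s2, z2) = mu1, mu2
        return abs(z1 - z2) + abs(s1 - s2)
    if mu1[0] == "dipole" and mu2[0] == "dipole":
        (_, x1, y1), (_, x2, y2) = mu1, mu2
        # Euclidean norm of the stacked differences (x1 - x2, y1 - y2).
        return math.hypot(x1 - x2, y1 - y2)
    # Mixed pairs (one Dirac atom, one dipole) are assigned the value 0.
    return 0.0

same_type = g(("dipole", 0.0, 0.0), ("dipole", 3.0, 4.0))
sign_flip = g(("dirac", 1, 0.2), ("dirac", -1, 0.5))
mixed = g(("dirac", 1, 0.2), ("dipole", 0.1, 0.4))
```

Note that $g$ is only used locally, for $\mu $ in a neighborhood of one of the maximizers, where both arguments are of the same type; the value 0 for mixed pairs never enters the estimates in (40).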

The proof is split into two parts. First, we characterize open $d_{\mathcal {B}}$-neighborhoods around the associated extremal points.

Lemma A.3

For $0< R$ define the sets

$$\begin{aligned} U^1_i(R) :=\left\{ \,({\text {sign}}(\bar{q}(\bar{z}_i))/\alpha )\delta _{z}\;|\;z \in B_R(\bar{z}_i)\,\right\} \quad \text {for all}~ i=1, \dots , \bar{N}_1, \end{aligned}$$

as well as

$$\begin{aligned} U^2_j(R) :=\left\{ \,\mathcal {D}_\beta (x,y)\;|\;(x,y) \in B_R(\bar{x}_j)\times B_R(\bar{y}_j)\,\right\} \quad \text {for all}~ j=1, \dots , \bar{N}_2. \end{aligned}$$

Then, ${U}^1_i(R)$ is a $d_{\mathcal {B}}$-neighborhood of $({\text {sign}}(\bar{q}(\bar{z}_i))/\alpha )\delta _{\bar{z}_i}$, $i=1,\dots ,\bar{N}_1$, and ${U}^2_j(R)$ is a $d_{\mathcal {B}}$-neighborhood of $\mathcal {D}_\beta (\bar{x}_j,\bar{y}_j)$, $j=1,\dots ,\bar{N}_2$. Moreover, for every $R>0$ small enough, there holds ${U}^1_i(R),~{U}^2_j(R) \subset {\text {Ext}}(B)$.

Proof

Let indices $i\in \{1, \dots , \bar{N}_1\}$ and $j\in \{1, \dots , \bar{N}_2\}$ be arbitrary but fixed. We first show the claimed statement for ${U}^2_j(R)$. Noting that $(\mathcal {B},d_{\mathcal {B}})$ is a metric space, it suffices to show that any sequence $\{\mu _k\}_k \subset \mathcal {B}$ with $\mu _k \rightharpoonup ^* \mathcal {D}_\beta (\bar{x}_j,\bar{y}_j)$ lies in ${U}^2_j(R)$ for all $k \in \mathbb {N}$ large enough. For this purpose, assume that $\{\mu _k\}_{k}$ admits a subsequence, denoted by the same symbol, of the form $\mu _k=(\sigma _k/\alpha ) \delta _{z_k}$ for some $\sigma _k \in \{-1,1\},~z_k \in \Omega $. Then, by possibly selecting another subsequence, we get $\mu _k \rightharpoonup ^* (\bar{\sigma }/\alpha )\delta _{\bar{z}}$ for some $\bar{\sigma } \in \{-1,1\},~\bar{z} \in \Omega $. Since weak* limits are unique and $(\bar{\sigma }/\alpha )\delta _{\bar{z}} \ne \mathcal {D}_\beta (\bar{x}_j, \bar{y}_j)$, this yields a contradiction. In the same way, we exclude the existence of a subsequence with $\mu _k=0$ for all $k$. Hence, for all $k\in \mathbb {N}$ large enough, we have $\mu _k=\mathcal {D}_\beta (x_k,y_k)$ for some $(x_k,y_k ) \in \Omega \times \Omega $ with $0< |x_k-y_k|^p\le 2\alpha -\beta $. By a similar contradiction argument, $(x_k,y_k) \rightarrow (\bar{x}_j,\bar{y}_j)$ has to hold. Thus, for every $k \in \mathbb {N}$ large enough, we have $(x_k,y_k) \in B_{R}(\bar{x}_j)\times B_{R}(\bar{y}_j)$ and thus $\mu _k \in {U}^2_j(R)$, finishing this part of the proof.

The statement for ${U}^1_i(R)$ follows by a similar argument. In fact, if $\{\mu _k\}_k \subset \mathcal {B} $ satisfies $\mu _k \rightharpoonup ^* \bar{\mu }^1_i$, then $\mu _k=(\sigma _k/\alpha ) \delta _{z_k}$, $\sigma _k \in \{-1,1\},~z_k\in \Omega $ for all $k$ large enough since $\bar{\mu }^1_i \ne \mathcal {D}_\beta (x,y)$ for every $(x,y)\in \Omega \times \Omega $. Moreover, from [8, Lemma 3.16], we get $\sigma _k={{\,\textrm{sign}\,}}(\bar{q}(\bar{z}_i))$ for all $k \in \mathbb {N}$ large enough. Finally, if there is a subsequence of $\{z_k\}_k$, denoted by the same symbol, with $z_k \rightarrow \bar{z}$ and $\bar{z}\ne \bar{z}_i$, then we can choose $\varphi \in \mathcal {C}(\Omega )$ satisfying $\varphi (\bar{z})=0$ and $\varphi (\bar{z}_i)=1$. For the corresponding subsequence of measures $\mu _k$, we then obtain

$$\begin{aligned} \langle \varphi , \mu _k \rangle = (\sigma _k/\alpha ) \varphi (z_k) \rightarrow ({{\,\textrm{sign}\,}}(\bar{q}(\bar{z}_i))/\alpha ) \varphi (\bar{z})=0 \ne \langle \varphi ,\bar{\mu }^1_i \rangle \end{aligned}$$

yielding a contradiction and thus $\bar{z}=\bar{z}_i$. $\square $

Next we prove the Lipschitz and quadratic growth properties from (40).

Lemma A.4

There are $R_1,C_K>0$ with

$$\begin{aligned} \Vert K(\mu -\bar{\mu }^\ell _j) \Vert _Y \le C_K \, g(\mu , \bar{\mu }^\ell _j) \end{aligned}$$

for all $\mu \in U^\ell _j(R_1)$, $j=1,\dots ,\bar{N}_\ell $, $\ell =1,2$.

Proof

By assumption, $K_* :Y \rightarrow {\text {Lip}}(\Omega )$ is continuous. As a consequence, we immediately get

$$\begin{aligned} \Vert K(\delta _z-\delta _{\bar{z}_i}) \Vert _Y&= \sup _{\Vert v\Vert _Y \le 1} \langle K_* v, \delta _z-\delta _{\bar{z}_i}\rangle = \sup _{\Vert v\Vert _Y \le 1} \left[[K_* v ](z)-[K_* v ](\bar{z}_i)\right]\\&\le \Vert K_*\Vert _{Y, {\text {Lip}}} |z-\bar{z}_i| \end{aligned}$$

for all $z\in \Omega $. For $\mathcal {D}_\beta (\bar{x}_j, \bar{y}_j)$ we can argue similarly. For this purpose, if $R_1>0$ is small enough, we have

$$\begin{aligned} \big | |\bar{x}_j-\bar{y}_j|^p-|x-y|^p \big | \le c \left| \begin{pmatrix} x-\bar{x}_j \\ y-\bar{y}_j\end{pmatrix} \right| \end{aligned}$$

for all $(x,y) \in B_{R_1}(\bar{x}_j) \times B_{R_1}(\bar{y}_j)$ since $|\bar{x}_j-\bar{y}_j|>0$. As a consequence, we get

$$\begin{aligned} \Vert K(\mathcal {D}_\beta (x,y)-\mathcal {D}_\beta (\bar{x}_j,\bar{y}_j)) \Vert _Y&= \sup _{\Vert v\Vert _Y \le 1} \langle K_* v, \mathcal {D}_\beta (x,y)-\mathcal {D}_\beta (\bar{x}_j,\bar{y}_j)\rangle \\&= \sup _{\Vert v\Vert _Y \le 1} \left[\frac{[K_*v ](x)-[K_*v ](y)}{\beta + |x-y|^p}-\frac{[K_*v ](\bar{x}_j)-[K_*v ](\bar{y}_j)}{\beta + |\bar{x}_j-\bar{y}_j|^p} \right]\\&\le D_1+ D_2 \end{aligned}$$

where we abbreviate

$$\begin{aligned} D_1&:=\frac{\Vert K_*\Vert _{Y,{\text {Lip}}}(|x-\bar{x}_j|+|y-\bar{y}_j|)}{\beta + |\bar{x}_j-\bar{y}_j|^p} \le c \left| \begin{pmatrix} x-\bar{x}_j \\ y-\bar{y}_j\end{pmatrix} \right| \end{aligned}$$

as well as

$$\begin{aligned} D_2&:=\left( \frac{1}{\beta + |x-y|^p}-\frac{1}{\beta + |\bar{x}_j-\bar{y}_j|^p} \right) \left( [K_*v ](x)-[K_*v ](y)\right) \\&\le 2 \Vert K_*\Vert _{Y, \mathcal {C}} \, \frac{\big | |\bar{x}_j-\bar{y}_j|^p-|x-y|^p \big |}{(\beta + |x-y|^p)(\beta + |\bar{x}_j-\bar{y}_j|^p)} \\&\le \frac{2c \Vert K_*\Vert _{Y, \mathcal {C}}}{\beta ^2} \left| \begin{pmatrix} x-\bar{x}_j \\ y-\bar{y}_j\end{pmatrix} \right| . \end{aligned}$$

The claimed statement then follows by definition of $U^1_i(R_1)$ and $U^2_j(R_1)$ from Lemma A.3 and noting that

$$\begin{aligned} g(\mu , \bar{\mu }^1_i)=|z-\bar{z}_i| \quad \text {for all}~\mu = ({{\,\textrm{sign}\,}}(\bar{q}(\bar{z}_i))/\alpha )\delta _z \in U^1_i(R_1) \end{aligned}$$

as well as

$$\begin{aligned} g(\mu , \bar{\mu }^2_j)=\left| \begin{pmatrix} x-\bar{x}_j \\ y-\bar{y}_j\end{pmatrix} \right| \quad \text {for all}~\mu = \mathcal {D}_\beta (x,y) \in U^2_j(R_1). \end{aligned}$$

Since all involved constants are independent of i and j, respectively, we conclude. $\square $
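
The splitting into $D_1$ and $D_2$ can be checked numerically in a toy setting. The sketch below is an illustration under stated assumptions and not the paper's model problem: it takes $p=1$, $Y=\mathbb {R}^m$, and a hypothetical operator $K\mu =(\int k_i\,\textrm{d}\mu )_{i=1}^m$ with Gaussian kernels $k_i$. For $p=1$ one may take $c=\sqrt{2}$, and the two norms of $K_*$ are replaced by the crude bounds `lip` and `sup` computed in the code.

```python
import numpy as np

# Hypothetical forward operator (an assumption, not the paper's model problem):
# K mu = (int k_i dmu)_{i=1..m} with Gaussian kernels centred at t_i, Y = R^m.
m, s = 5, 0.2
centres = np.linspace(0.0, 1.0, m)

def k(z):
    """Vector (k_1(z), ..., k_m(z)) of kernel values at the point z."""
    return np.exp(-((z - centres) ** 2) / (2.0 * s**2))

def K_dipole(x, y, beta):
    """K applied to D_beta(x, y) = (delta_x - delta_y) / (beta + |x - y|), p = 1."""
    return (k(x) - k(y)) / (beta + abs(x - y))

beta = 0.5
x_bar, y_bar = 0.3, 0.7

# Constants from the D_1 / D_2 splitting, specialised to p = 1 (c = sqrt(2)).
lip = np.sqrt(m) * np.exp(-0.5) / s   # bound on the Lipschitz constant of z -> k(z)
sup = np.sqrt(m)                      # bound on the Euclidean norm of k(z)
C_K = np.sqrt(2.0) * lip / beta + 2.0 * np.sqrt(2.0) * sup / beta**2

rng = np.random.default_rng(0)
ratios = []
for _ in range(1000):
    x = x_bar + rng.uniform(-0.1, 0.1)
    y = y_bar + rng.uniform(-0.1, 0.1)
    dist = np.hypot(x - x_bar, y - y_bar)   # g(mu, mu_bar) for two dipoles
    if dist > 0:
        lhs = np.linalg.norm(K_dipole(x, y, beta) - K_dipole(x_bar, y_bar, beta))
        ratios.append(lhs / dist)
max_ratio = max(ratios)
```

The observed `max_ratio` stays below `C_K`, as the estimate predicts; the constant is far from sharp because both `lip` and `sup` are worst-case bounds.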

Proposition A.5

Let Assumption $(\textbf{B3})$ hold. Then, there are $\theta >0 $ and a radius $0<R_2$ with

$$\begin{aligned} 1-\langle \bar{q}, \mu \rangle \ge \theta \, g(\mu ,\bar{\mu }^\ell _j)^2 \quad \text {for all}~\mu \in U^\ell _j(R_2), \end{aligned}$$

and $j=1,\dots ,\bar{N}_\ell $, $\ell =1,2$.

Proof

Since $\bar{z}_i \in {\text {int}}\Omega $ is a global extremum of $\bar{q}$ and $(\bar{x}_j,\bar{y}_j) \in {\text {int}}\Omega \times {\text {int}}\Omega $ is a global maximum of $\Psi _{\bar{q}}$, we have $\nabla \bar{q}(\bar{z}_i)=0$ and $\nabla \Psi _{\bar{q}}(\bar{x}_j, \bar{y}_j)=0$, respectively. Using the non-degeneracy of the associated Hessians, see Assumption $(\textbf{B3})$, and the continuity of $\bar{q}$, we conclude the existence of $R_2>0$ as well as of $\theta >0$ with

$$\begin{aligned} {{\,\textrm{sign}\,}}(\bar{q}(z))= {{\,\textrm{sign}\,}}(\bar{q}(\bar{z}_i)),~1-|\bar{q}(z)|/\alpha \ge \theta \, |z-\bar{z}_i|^2 \quad \text {for all}~z \in B_{R_2}(\bar{z}_i), \end{aligned}$$

as well as

$$\begin{aligned} 1-\Psi _{\bar{q}}(x,y) \ge \theta \, \left| \begin{pmatrix} x-\bar{x}_j \\ y-\bar{y}_j\end{pmatrix} \right| ^2 \quad \text {for all}~(x,y) \in B_{R_2}(\bar{x}_j,\bar{y}_j), \end{aligned}$$

by Taylor’s expansion. This implies

$$\begin{aligned} 1-\langle \bar{q}, \mu _1 \rangle =1-{{\,\textrm{sign}\,}}(\bar{q}(z)) \bar{q}(z)/\alpha =1-|\bar{q}(z)|/\alpha \ge \theta \, |z-\bar{z}_i|^2= \theta \,g(\mu _1, \bar{\mu }^1_i)^2, \end{aligned}$$

as well as

$$\begin{aligned} 1-\langle \bar{q}, \mu _2 \rangle =1-\Psi _{\bar{q}}(x,y) \ge \theta \, \left| \begin{pmatrix} x-\bar{x}_j \\ y-\bar{y}_j\end{pmatrix} \right| ^2 = \theta \,g(\mu _2, \bar{\mu }^2_j)^2 \end{aligned}$$

for all

$$\begin{aligned} \mu _1= ({\text {sign}}(\bar{q}(\bar{z}_i))/\alpha ) \delta _z \in U^1_i(R_2)~\quad \text {and} \quad \mu _2= \mathcal {D}_\beta (x,y) \in U^2_j(R_2). \end{aligned}$$

(41)

By Lemma A.3, all elements of $U^1_i(R_2)$ and $U^2_j(R_2)$, respectively, are of the form (41), thus finishing the proof. $\square $
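
The quadratic growth in Proposition A.5 can be made concrete with a toy dual variable. The sketch below assumes the hypothetical choice $q(z)=\alpha \cos (z-\bar{z})$, which has a nondegenerate maximum $q(\bar{z})=\alpha $ at $\bar{z}$; this function, the radius `R` and the grid are illustrative assumptions and not data from the paper. It estimates the largest admissible $\theta $ on a ball and compares it with the second-order Taylor prediction.

```python
import numpy as np

# Hypothetical dual variable with a nondegenerate interior maximum at z_bar
# (an assumption for illustration): q(z) = alpha * cos(z - z_bar).
alpha, z_bar, R = 2.0, 0.5, 0.5

def q(z):
    return alpha * np.cos(z - z_bar)

z = np.linspace(z_bar - R, z_bar + R, 201)
z = z[np.abs(z - z_bar) > 1e-12]          # exclude the maximizer itself

# Ratio (1 - |q(z)|/alpha) / |z - z_bar|^2; its minimum is the best theta.
ratio = (1.0 - np.abs(q(z)) / alpha) / (z - z_bar) ** 2
theta = ratio.min()

# Second-order Taylor prediction: -q''(z_bar) / (2 * alpha) = 1/2.
taylor = 0.5
```

Here `theta` is attained at the boundary of the ball and lies slightly below `taylor`, which matches the role of the Hessian in the proof: any constant below half the normalized curvature works once the radius is small enough.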

Summarizing the previous observations, we conclude Theorem 3.15 using the results of [8]:

Proof of Theorem 3.15

Summarizing our previous observations, we have that:

The function F is strongly convex around the optimal observation $\bar{y}$, see Assumption $(\textbf{B2})$.
According to Lemma A.2, there exists $\{\bar{\mu }_j\}^{\bar{N}}_{j=1} \subset {\text {Ext}}(B)$ with ${{\,\mathrm{arg\,max}\,}}_{\mu \in \mathcal {B}}\langle \bar{q}, \mu \rangle =\{\bar{\mu }_j\}^{\bar{N}}_{j=1}$.
The set $\{\bar{\mu }_j\}^{\bar{N}}_{j=1}$ is linearly independent, see Assumption $(\textbf{B4})$.
The unique solution $\bar{u}=\sum ^{\bar{N}}_{j=1} \bar{\gamma }_j \bar{\mu }_j$ satisfies $\bar{\gamma }_j>0$, see Assumption $(\textbf{B5})$.
There are $d_{\mathcal {B}}$-neighborhoods $U_j$ of $\bar{\mu }_j$ for $j=1,\dots ,\bar{N}$, a function $g :{\text {Ext}}(B) \times {\text {Ext}}(B) \rightarrow \mathbb {R}$ and $C_K,\theta >0$ with
$$\begin{aligned} \Vert K(\mu \!-\!\bar{\mu }_j)\Vert _Y \le C_K g(\mu , \bar{\mu }_j), 1\!-\!\langle \bar{q}, \mu \rangle \ge \theta \, g(\mu ,\bar{\mu }_j)^2\, \text {for all}~\mu \in U_j \cap {\text {Ext}}(B). \end{aligned}$$

Consequently, the assumptions of [8, Theorem 3.8] are satisfied, and applying it we conclude the linear convergence of Theorem 3.15. $\square $

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Carioni, M., Iglesias, J.A. & Walter, D. Extremal Points and Sparse Optimization for Generalized Kantorovich–Rubinstein Norms. Found Comput Math (2023). https://doi.org/10.1007/s10208-023-09634-7

Received: 03 October 2022
Revised: 31 July 2023
Accepted: 29 September 2023
Published: 11 December 2023
DOI: https://doi.org/10.1007/s10208-023-09634-7
