Abstract
In this note, we characterize the set of extreme points of the Lipschitz unit ball in a specific vectorial setting. While the case of real-valued functions is covered extensively in the literature, no information about the vectorial case has been provided to date. Here, we aim to partially fill this gap by considering Lipschitz functions mapping from a finite metric space to a strictly convex Banach space. As a consequence, we present a representer theorem for such functions. In this setting, the number of extreme points needed to express any point inside the ball is independent of the dimension, improving the classical result from Carathéodory.
1 Introduction
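Let \(({\mathcal {X}}, d)\) be a metric space, \(({\mathcal {Y}}, \Vert \cdot \Vert )\) a non-trivial, strictly convex real Banach space, and consider the Banach space of Lipschitz functions [25] from \({\mathcal {X}}\) to \({\mathcal {Y}}\) that vanish at a distinguished point \(x_0\in {\mathcal {X}}\). Let \(L\ge 0\). We define the set
\[{\text {Lip}}_0^L:=\{f:{\mathcal {X}}\rightarrow {\mathcal {Y}} \, \mid \, f(x_0)=0, \ \Vert f(x)-f(x')\Vert \le L\, d(x,x') \ \text {for all} \ x, x'\in {\mathcal {X}}\}.\]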
In this paper, we study certain structural properties of the special case \({\text {Lip}}_0^1\) (all other cases can be treated analogously). In particular, we are interested in characterizing the set of its extreme points. The set of extreme points of \({\text {Lip}}^1_0\) has been widely studied in the past decades for real-valued functions \(f:{\mathcal {X}}\rightarrow \mathbb {R}\) (see [11, 13, 17,18,19, 22]), and more recently in [1, 9], but no information about the set \({\text {ext}}({\text {Lip}}^1_0)\) has been provided in the case of \({\mathcal {Y}}\ne \mathbb {R}\). In this paper, we show that this set can be characterized when considering a finite metric space \({\mathcal {X}}=\{x_0,\ldots ,x_n\}\), where \(n\ge 1\) and \(x_0,\ldots , x_n\) are distinct points.
The name "representer theorem" was introduced in the field of machine learning, in particular, in the context of kernel methods [21]. Roughly speaking, these results show that any element of a space can be expressed as a linear combination of finitely many specific points. Recently, the study of representation results has gained popularity in the setting of variational inverse problems [3, 4, 23]. Moreover, motivated by the so-called Minkowski–Carathéodory theorem [14, Theorem III.2.3.4], which yields representations in terms of extreme points in the finite-dimensional setting, it has been observed that there is a natural connection between extreme points and representation results [12]. As shown in [7, 8], a proper characterization of extreme points may lead to efficient optimization algorithms. For these reasons, there is increasing interest in characterizing extreme points associated with various regularizers, see [2, 4,5,6, 8, 10, 15, 24]. Finally, we mention that Lipschitz-type constraints have also recently become of interest in the context of plug-and-play regularization [20] and monotone splitting algorithms [16]. Hence, a proper characterization of the extreme points of the Lipschitz unit ball may have a considerable impact. In this paper, we aim to partially fill this gap.
We present the characterization result in Theorem 2.2. Finally, we provide in Theorem 2.3 a representer theorem for the set \({\text {Lip}}_0^1 \) that improves the Minkowski–Carathéodory theorem, in the sense that the number of required extreme points is independent of the dimension of the space \({\mathcal {Y}}\).
2 Extreme points and representer theorems
We recall that, given a convex subset C of a real vector space, an extreme point of C is a point \(y\in C\) such that, if \(y=\lambda y^1 + (1-\lambda )y^2\) with \(y^1, y^2\in C\) and \(\lambda \in (0,1)\), then \(y^1=y^2=y\). In other words, an extreme point of a convex set C is a point y in C such that \(C\setminus \{y\}\) is convex. We remind the reader of the well-known fact that it is sufficient to check the extreme point condition only for \(\lambda =1/2\). We denote by \({\text {ext}}(C)\) the set of extreme points of C.
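For further convenience, we consider the following definition of the set \({\text {Lip}}_0^1\):
\[{\text {Lip}}_0^1:=\{y=(y_0,\ldots ,y_n)\in {\mathcal {Y}}^{n+1} \, \mid \, y_0=0, \ \Vert y_i-y_j\Vert \le d(x_i,x_j) \ \text {for all} \ i,j=0,\ldots ,n\}.\]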
Note that the two definitions of the set \({\text {Lip}}_0^1 \) are equivalent: since \({\mathcal {X}}=\{x_0,\ldots ,x_n\}\) is finite, a function f mapping \({\mathcal {X}}\) to \({\mathcal {Y}}\) is determined by its values \(y_i=f(x_i)\), \(i=0,\ldots ,n\). We now provide a preliminary lemma.
Lemma 2.1
Let \(y\in {\text {Lip}}^1_0 \). For every \(i=1,\ldots , n\), there exist indices \(0=i_0, \, i_1,\ldots , i_k=i\), \(k\ge 1\), such that \(\Vert y_{i_{j+1}}-y_{i_j}\Vert =d(x_{i_{j+1}},x_{i_j})\) for every \(j=0,\ldots , k-1\) if and only if there does not exist a non-empty subset \(S\subset \{1,\ldots ,n\}\) such that \(\Vert y_i-y_j\Vert <d(x_i,x_j)\) for every \(i\in S\), \(j\in S^c\).
Proof
First, we proceed by contradiction: assume that the path condition holds, let \(S\subset \{1,\ldots , n\}\), \(S\ne \emptyset \), be such that \(\Vert y_i-y_j\Vert <d(x_i,x_j)\) for every \(i\in S\), \(j\in S^c\), and let \(i\in S\). By hypothesis, we can choose \(k\ge 1\) and \(0=i_0,\ldots , i_k=i\) such that \(\Vert y_{i_{j+1}}-y_{i_j}\Vert =d(x_{i_j}, x_{i_{j+1}})\) for every \(j=0,\ldots , k-1\). As \(i_0=0\in S^c\) and \(i_k=i\in S\), there must exist \(j\in \{0,\ldots , k-1\}\) such that \(i_{j+1}\in S\) and \(i_j\in S^c\). It follows that \(\Vert y_{i_{j+1}}-y_{i_j}\Vert <d(x_{i_{j+1}}, x_{i_{j}})\), which contradicts the equality along the path and, hence, concludes the first part of the proof.
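Conversely, let us consider the set
\[T:=\{i\in \{1,\ldots ,n\} \, \mid \, \text {there exist} \ k\ge 1 \ \text {and} \ 0=i_0, i_1,\ldots ,i_k=i \ \text {with} \ \Vert y_{i_{j+1}}-y_{i_j}\Vert =d(x_{i_{j+1}},x_{i_j}) \ \text {for every} \ j=0,\ldots ,k-1\}\]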
and suppose that \(T\ne \{1,\ldots , n\}\). Define the set \(S:=\{1,\ldots , n\}\setminus T\), and observe that \(S\ne \emptyset \) and \(S^c=T\cup \{0\}\). It is left to prove that, for every \(i\in S\), \(j\in S^c\), \(\Vert y_i-y_j\Vert < d(x_i,x_j)\). Suppose that there exist \(i\in S\) and \(j\in S^c\) such that \(\Vert y_i-y_j\Vert =d(x_i,x_j)\). Since \(j\in S^c\), there exist \(0=i_0,\ldots , i_k=j\), \(k\ge 0\) (with \(k=0\) if \(j=0\)), such that \(\Vert y_{i_{\ell +1}}-y_{i_{\ell }}\Vert =d(x_{i_{\ell }},x_{i_{\ell +1}})\) for \(\ell =0,\ldots , k-1\). Since, by hypothesis, \(\Vert y_i-y_j\Vert =d(x_i,x_j)\), defining \(i_{k+1}:=i\) we obtain a path from 0 to i satisfying the equalities for \(\ell =0,\ldots , k\). This implies that \(i\in T\), which contradicts \(i\in S\). We have therefore found a set \(S\subset \{1,\ldots , n\}\), \(S\ne \emptyset \), such that \(\Vert y_i-y_j\Vert <d(x_i,x_j)\) for every \(i\in S\), \(j\in S^c\). \(\square \)
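As a computational illustration of Lemma 2.1, the path condition can be checked by a graph search: join i and j whenever the Lipschitz constraint between them is tight, and collect the indices reachable from 0. The following is only a sketch under assumptions that are not part of the lemma: \({\mathcal {Y}}=\mathbb {R}^m\) with the Euclidean norm (which is strictly convex), the metric given as a matrix `d`, and equality tested up to a floating-point tolerance. The names `tight_reachable` and `strict_cut` are hypothetical.

```python
import math


def tight_reachable(y, d, tol=1e-9):
    """Indices reachable from 0 along edges (i, j) with ||y_i - y_j|| = d[i][j].

    y   : list of points in R^m (tuples), with y[0] the zero vector
    d   : symmetric matrix of distances d[i][j] = d(x_i, x_j)
    tol : tolerance used to decide whether a constraint is tight (assumption)
    """
    reached = {0}
    stack = [0]
    while stack:
        i = stack.pop()
        for j in range(len(y)):
            tight = abs(math.dist(y[i], y[j]) - d[i][j]) <= tol
            if j not in reached and tight:
                reached.add(j)
                stack.append(j)
    return reached


def strict_cut(y, d, tol=1e-9):
    """Set S of indices not reachable from 0; empty when the path condition holds."""
    return set(range(len(y))) - tight_reachable(y, d, tol)
```

If `strict_cut` returns a non-empty set S, then no tight constraint links S to its complement, which is the second statement of the lemma; if it returns the empty set, every index is joined to 0 by a chain of tight constraints.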
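Define now the set
\[{\mathcal {E}}:=\{y\in {\text {Lip}}_0^1 \, \mid \, \text {for every} \ i=1,\ldots ,n \ \text {there exist} \ k\ge 1 \ \text {and} \ 0=i_0, i_1,\ldots ,i_k=i \ \text {with} \ \Vert y_{i_{j+1}}-y_{i_j}\Vert =d(x_{i_{j+1}},x_{i_j}) \ \text {for every} \ j=0,\ldots ,k-1\}.\]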
Observe that the definition of \({\mathcal {E}}\) is motivated by the previous lemma since every point \(y\in {\mathcal {E}}\) satisfies the first condition of Lemma 2.1. We are now ready to characterize the extreme points of the set \({\text {Lip}}_0^1 \).
Theorem 2.2
We have that \({\text {ext}}({\text {Lip}}_0^1 )={\mathcal {E}}\).
Proof
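First, we will prove that, if \(y\notin {\mathcal {E}}\), then \(y\notin {\text {ext}}({\text {Lip}}_0^1 )\). Let \(y\in {\text {Lip}}_0^1 \) be such that \(y\notin {\mathcal {E}}\). By the previous lemma, there exists \(S\subset \{1,\ldots , n\}\), \(S\ne \emptyset \), such that \(\Vert y_i-y_j\Vert < d(x_i,x_j)\) for every \(i\in S\), \(j\in S^c\). Choose now
\[\varepsilon :=\min \{d(x_i,x_j)-\Vert y_i-y_j\Vert \, \mid \, i\in S, \ j\in S^c\}\]
and observe that \(\varepsilon >0\). Moreover, choose \(v\in {\mathcal {Y}}\) such that \(\Vert v\Vert =1\) (which exists since \({\mathcal {Y}}\) is non-trivial) and set, for \(i=0,\ldots ,n\),
\[y_i^1:={\left\{ \begin{array}{ll} y_i+\varepsilon v, &{} i\in S,\\ y_i, &{} i\in S^c, \end{array}\right. } \qquad y_i^2:={\left\{ \begin{array}{ll} y_i-\varepsilon v, &{} i\in S,\\ y_i, &{} i\in S^c. \end{array}\right. }\]
If we define \(y^k:=(y^k_0, y^k_1,\ldots , y^k_n)\), \(k=1, 2 \), then \(y^1\ne y^2\), because \(S\ne \emptyset \) and \(\varepsilon v\ne 0\). Moreover, observe that
\[\Vert y_i^k-y_j^k\Vert =\Vert y_i-y_j\Vert \le d(x_i,x_j)\quad \text {for} \ i,j\in S \ \text {or} \ i,j\in S^c,\]
since \(y\in {\text {Lip}}_0^1\) and
\[\Vert y_i^k-y_j^k\Vert \le \Vert y_i-y_j\Vert +\varepsilon \le d(x_i,x_j)\quad \text {for} \ i\in S, \ j\in S^c.\]
Therefore, \(y^k\in {\text {Lip}}_0^1\), \(k=1, 2\) (note that \(y_0^k=y_0=0\) because \(0\in S^c\)), \(y=\frac{1}{2} y^1+\frac{1}{2} y^2\), and \(y^1\ne y^2\), so \(y\notin {\text {ext}}({\text {Lip}}_0^1)\). Hence, \({\text {ext}}({\text {Lip}}_0^1)\subset {\mathcal {E}}\).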
We now prove that \({\mathcal {E}}\subset {\text {ext}}({\text {Lip}}_0^1 )\). Let \(y\in {\text {Lip}}_0^1 {\setminus }{\text {ext}}({\text {Lip}}_0^1 )\). We will prove that there exists \(S\subset \{1,\ldots , n\}\), \(S\ne \emptyset \), such that \(\Vert y_i-y_j\Vert <d(x_i,x_j)\) for every \(i\in S\), \(j\in S^c\). By the previous lemma, this means that \(y\notin {\mathcal {E}}\). Since \(y\notin {\text {ext}}({\text {Lip}}_0^1 )\), there exist \(y^1, y^2\in {\text {Lip}}_0^1\), \(y^1\ne y^2\), such that \(y=\frac{1}{2} y^1 +\frac{1}{2} y^2\). Now, define the set \(S=\{i\in \{1,\ldots , n\} \, \mid \, y_i^1\ne y_i^2\}\) and observe that it is non-empty since \(y^1\ne y^2\) by hypothesis. Now, let \(i\in S\), \(j\in S^c\). Then, \(y_j^1=y_j^2=y_j\) and
\[\Vert y_i-y_j\Vert =\left\Vert \tfrac{1}{2} (y_i^1-y_j^1)+\tfrac{1}{2} (y_i^2-y_j^2)\right\Vert =\left\Vert \tfrac{1}{2} (y_i^1-y_j)+\tfrac{1}{2} (y_i^2-y_j)\right\Vert .\]
In order to finish the proof, define \(a:= y_i^1 - y_j\), \(b:=y_i^2- y_j\), and observe that \(a\ne b\), \(\Vert a\Vert \le d(x_i,x_j)\), and \(\Vert b\Vert \le d(x_i,x_j)\). Now, we distinguish two cases: if a is not proportional to b, we get
\[\Vert y_i-y_j\Vert =\left\Vert \tfrac{1}{2} a+\tfrac{1}{2} b\right\Vert <\tfrac{1}{2} \Vert a\Vert +\tfrac{1}{2} \Vert b\Vert \le d(x_i,x_j),\]
since we assumed that \({\mathcal {Y}}\) is a strictly convex space. If they are proportional, then, by possibly interchanging a and b, we have \(b=\lambda a\) for some \(\lambda \ne 1\), and we can further assume that \(-1\le \lambda < 1\). Since \(a\ne b\) forces \(a\ne 0\), we obtain that
\[\Vert y_i-y_j\Vert =\tfrac{1+\lambda }{2}\Vert a\Vert <\Vert a\Vert \le d(x_i,x_j).\]
The result immediately follows. \(\square \)
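The first half of the proof is constructive, and a small numerical sketch may make the perturbation concrete. As before, it assumes \({\mathcal {Y}}=\mathbb {R}^m\) with the Euclidean norm; it takes the set S as given and uses the smallest slack across the cut as \(\varepsilon \), which is one admissible choice and an assumption of this sketch. The function names `split_non_extreme` and `in_lip_ball` are hypothetical.

```python
import math


def split_non_extreme(y, d, S, v):
    """Return (eps, y1, y2) with y = (y1 + y2) / 2 and y1 != y2.

    y : list of points in R^m (tuples), y[0] the zero vector
    d : matrix of distances d[i][j] = d(x_i, x_j)
    S : non-empty set of indices with ||y_i - y_j|| < d[i][j] for i in S, j not in S
    v : unit vector in R^m
    """
    complement = [j for j in range(len(y)) if j not in S]
    # Smallest slack across the cut; positive by the assumption on S.
    eps = min(d[i][j] - math.dist(y[i], y[j]) for i in S for j in complement)

    def shifted(sign):
        # Move the points indexed by S by sign * eps * v and keep the others fixed.
        return [
            tuple(c + sign * eps * vc for c, vc in zip(y[i], v)) if i in S else tuple(y[i])
            for i in range(len(y))
        ]

    return eps, shifted(+1), shifted(-1)


def in_lip_ball(y, d, tol=1e-9):
    """Check y_0 = 0 and ||y_i - y_j|| <= d[i][j] for all i, j (up to tol)."""
    n_pts = len(y)
    return all(c == 0 for c in y[0]) and all(
        math.dist(y[i], y[j]) <= d[i][j] + tol for i in range(n_pts) for j in range(n_pts)
    )
```

For the three-point path metric with distances 1, 1, 2 and y = ((0, 0), (1, 0), (1, 0.5)), taking S = {2} and v = (0, 1) moves only the last point up and down, and both perturbed tuples remain in the ball.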
We are now ready to state the representer theorem for the set \({\text {Lip}}_0^1\). In the case of \({\mathcal {Y}}=\mathbb {R}^d\), the Minkowski–Carathéodory theorem would imply that every element of \({\text {Lip}}_0^1 \) can be represented as a convex combination of at most \(nd+1\) extreme points. We are able to reduce this number to \(n+1\) extreme points, which is independent of d and covers the infinite-dimensional case as well.
Theorem 2.3
For every \(y\in {\text {Lip}}_0^1 \), there exist \(k\le n+1\), \(y^1,\ldots , y^k \in {\text {ext}}({\text {Lip}}_0^1 )\), and scalars \(\lambda _1,\ldots , \lambda _k\ge 0\) with \(\sum _{i=1}^k\lambda _i=1\) such that \(y=\sum _{i=1}^k\lambda _iy^i\).
Proof
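Let \(y\in {\text {Lip}}_0^1\) and recall that it is of the form \(y=(y_0,\ldots , y_n)\) with \(y_0=0\). Choose \(v\in {\mathcal {Y}}\) such that \(\Vert v\Vert =1\). Define the set
\[D:=\{t=(t_0,\ldots ,t_n)\in \mathbb {R}^{n+1} \, \mid \, y+tv\in {\text {Lip}}_0^1\},\]
where \((y+tv)_i:=y_i+t_iv\) for every \(i=0,\ldots , n\). Moreover, observe that \(t_0=0\) for every \(t\in D\) since, if \(t_0\ne 0\), then \((y+tv)_0\ne 0\). Now, we claim that, if \(t\in {\text {ext}}(D)\), then \(y+tv\in {\text {ext}}({\text {Lip}}_0^1 )\). Indeed, if \(t\in D\) and \(y+tv\notin {\text {ext}}({\text {Lip}}_0^1)\), then, by Theorem 2.2 and Lemma 2.1, there exists a subset \(S\subset \{1,\ldots , n\}\), \(S\ne \emptyset \), such that \(\Vert y_i-y_j+(t_i-t_j)v\Vert <d(x_i,x_j)\) for every \(i\in S\), \(j\in S^c\). Choose
\[\varepsilon :=\min \{d(x_i,x_j)-\Vert y_i-y_j+(t_i-t_j)v\Vert \, \mid \, i\in S, \ j\in S^c\}\]
and observe that \(\varepsilon >0\). Moreover, define, for \(i=0,\ldots ,n\),
\[t_i^1:={\left\{ \begin{array}{ll} t_i+\varepsilon , &{} i\in S,\\ t_i, &{} i\in S^c, \end{array}\right. } \qquad t_i^2:={\left\{ \begin{array}{ll} t_i-\varepsilon , &{} i\in S,\\ t_i, &{} i\in S^c. \end{array}\right. }\]
With such definitions, observe that \(t^1\ne t^2\) and \(t_0^k=t_0=0\). Now, \(y+t^kv\in {\text {Lip}}_0^1\), for \(k=1, 2\), because
\[\Vert y_i-y_j+(t_i^k-t_j^k)v\Vert =\Vert y_i-y_j+(t_i-t_j)v\Vert \le d(x_i,x_j)\quad \text {for} \ i,j\in S \ \text {or} \ i,j\in S^c,\]
since \(t\in D\) and
\[\Vert y_i-y_j+(t_i^k-t_j^k)v\Vert \le \Vert y_i-y_j+(t_i-t_j)v\Vert +\varepsilon \le d(x_i,x_j)\quad \text {for} \ i\in S, \ j\in S^c.\]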
Then, \(t^1, t^2\in D\) and \(t=\frac{1}{2} t^1+\frac{1}{2} t^2\), \(t^1\ne t^2\), which implies that \(t\notin {\text {ext}}(D)\). Consequently, \(t\in {\text {ext}}(D)\) implies \(y+tv\in {\text {ext}}({\text {Lip}}_0^1 )\). Now, we show that D is a non-empty, convex, compact subset of \(\mathbb {R}^{n+1}\). First, note that \(0\in D\) and that convexity follows from the fact that D is the preimage of the convex set \({\text {Lip}}_0^1 \) through the affine mapping \(t\mapsto y+tv\). Moreover, boundedness follows because, for every \(t\in D\), we have
\[\Vert y_i+t_iv\Vert =\Vert (y+tv)_i-(y+tv)_0\Vert \le d(x_i,x_0),\]
and so, for every \(i=1,\dots ,n\), we have that
\[|t_i|=\Vert t_iv\Vert \le \Vert y_i+t_iv\Vert +\Vert y_i\Vert \le 2d(x_i,x_0).\]
It is only left to prove that D is closed. Let \((t^k)_{k\in \mathbb {N}}\) be a sequence in D converging to some \(t\in \mathbb {R}^{n+1}\). We have that
\[\Vert y_i-y_j+(t_i^k-t_j^k)v\Vert \le d(x_i,x_j)\]
for every \(i,j=0,\dots ,n\), together with \(t_0^k=0\). Taking limits as \(k\rightarrow \infty \) shows that \(t\in D\). By the Krein–Milman theorem, we know that \(D=\overline{\text {conv}}({\text {ext}}(D))\). Moreover, since \(0\in D\) and \(\textrm{span} \ D\subset \{0\}\times \mathbb {R}^n\), we have \(\textrm{dim} \ \textrm{span} \ D\le n\), so we can apply the Minkowski–Carathéodory theorem to the point \(0\in D\). Consequently, there exist \(k\le n+1\) and scalars \(\lambda _1,\ldots , \lambda _k\ge 0\) with \(\sum _{i=1}^k\lambda _i=1\) such that \(0=\sum _{i=1}^k\lambda _it^i\), with \(t^i\in {\text {ext}}(D)\), \(i=1,\ldots , k\). Finally, by the previous claim, we know that for every \(i=1,\ldots , k\), if we define \(y^i:= y+t^iv\), then \(y^i\in {\text {ext}}({\text {Lip}}_0^1 )\) and, hence,
\[y=y+\Big (\sum _{i=1}^k\lambda _it^i\Big )v=\sum _{i=1}^k\lambda _i(y+t^iv)=\sum _{i=1}^k\lambda _iy^i,\]
concluding the proof. \(\square \)
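The construction in this proof can be followed by hand in the smallest case \(n=1\). The sketch below is an illustration under assumptions that go beyond the theorem: \({\mathcal {X}}=\{x_0,x_1\}\) with \(d(x_0,x_1)=r>0\), and \({\mathcal {Y}}=\mathbb {R}^m\) with the Euclidean norm. In that case the set D reduces to \(\{0\}\times [t_-,t_+]\), where \(t_\pm \) are the roots of \(\Vert y_1+tv\Vert ^2=r^2\), and the weights solve \(\lambda t_-+(1-\lambda )t_+=0\). The function name `represent_two_point` is hypothetical.

```python
import math


def represent_two_point(y1, r, v):
    """Write y = (0, y1), ||y1|| <= r, as a convex combination of at most
    two extreme points (0, y1 + t * v) with ||y1 + t * v|| = r.

    y1 : point in R^m (tuple)
    r  : distance d(x_0, x_1) > 0
    v  : unit vector in R^m
    Returns a list of (weight, point) pairs.
    """
    b = sum(a * c for a, c in zip(y1, v))            # inner product <y1, v>
    disc = b * b - (sum(a * a for a in y1) - r * r)  # non-negative since ||y1|| <= r
    root = math.sqrt(max(disc, 0.0))
    t_minus, t_plus = -b - root, -b + root           # D = {0} x [t_minus, t_plus]
    if t_plus == t_minus:                            # D = {0}: y is already extreme
        return [(1.0, tuple(y1))]
    lam = t_plus / (t_plus - t_minus)                # lam * t_minus + (1 - lam) * t_plus = 0

    def point(t):
        return tuple(a + t * c for a, c in zip(y1, v))

    return [(lam, point(t_minus)), (1.0 - lam, point(t_plus))]
```

Both returned points lie on the sphere of radius r, which for \(n=1\) is exactly the set \({\mathcal {E}}\), and the number of points used is at most \(n+1=2\) regardless of m.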
References
1. Alimohammadi, D., Pazandeh, H.: Extreme points of the unit ball in the dual space of some real subspaces of Banach spaces of Lipschitz functions. ISRN Math. Anal. 2012, Art. ID 735139, 13 pp. (2012)
2. Ambrosio, L., Aziznejad, S., Brena, C., Unser, M.: Linear inverse problems with Hessian-Schatten total variation. Calc. Var. Partial Differential Equations 63(1), Paper No. 9, 28 pp. (2024)
3. Boyer, C., Chambolle, A., De Castro, Y., Duval, V., de Gournay, F., Weiss, P.: On representer theorems and convex regularization. SIAM J. Optim. 29(2), 1260–1281 (2019)
4. Bredies, K., Carioni, M.: Sparsity of solutions for variational inverse problems with finite-dimensional data. Calc. Var. Partial Differential Equations 59(1), Paper No. 14, 26 pp. (2020)
5. Bredies, K., Carioni, M., Fanzon, S.: A superposition principle for the inhomogeneous continuity equation with Hellinger-Kantorovich-regular coefficients. Comm. Partial Differential Equations 47(10), 2023–2069 (2022)
6. Bredies, K., Carioni, M., Fanzon, S., Romero, F.: On the extremal points of the ball of the Benamou-Brenier energy. Bull. Lond. Math. Soc. 53(5), 1436–1452 (2021)
7. Bredies, K., Carioni, M., Fanzon, S., Romero, F.: A generalized conditional gradient method for dynamic inverse problems with optimal transport regularization. Found. Comput. Math. 23(3), 833–898 (2023)
8. Bredies, K., Carioni, M., Fanzon, S., Walter, D.: Asymptotic linear convergence of fully-corrective generalized conditional gradient methods. Math. Program. (2023). https://doi.org/10.1007/s10107-023-01975-z
9. Bungert, L., Korolev, Y., Burger, M.: Structural analysis of an \(L\)-infinity variational problem and relations to distance functions. Pure Appl. Anal. 2(3), 703–738 (2020)
10. Carioni, M., Iglesias, J.A., Walter, D.: Extremal points and sparse optimization for generalized Kantorovich-Rubinstein norms. J. Convex Anal. 29(4), 1251–1290 (2022)
11. Cobzaş, S.: Extreme points in Banach spaces of Lipschitz functions. Mathematica (Cluj) 31(54)(1), 25–33 (1989)
12. Duval, V.: Faces and Extreme Points of Convex Sets for the Resolution of Inverse Problems. Habilitation à diriger des recherches, Ecole doctorale SDOSE (2022)
13. Farmer, J.D.: Extreme points of the unit ball of the space of Lipschitz functions. Proc. Amer. Math. Soc. 121(3), 807–813 (1994)
14. Hiriart-Urruty, J.-B., Lemaréchal, C.: Convex Analysis and Minimization Algorithms. I. Fundamentals. Grundlehren der mathematischen Wissenschaften, 305. Springer, Berlin (1993)
15. Iglesias, J.A., Walter, D.: Extremal points of total generalized variation balls in 1D: characterization and applications. J. Convex Anal. 29(4), 1251–1290 (2022)
16. Pesquet, J.-C., Repetti, A., Terris, M., Wiaux, Y.: Learning maximally monotone operators for image recovery. SIAM J. Imaging Sci. 14(3), 1206–1237 (2021)
17. Rao, V., Roy, A.: Extreme Lipschitz functions. Math. Ann. 189, 26–46 (1970)
18. Rolewicz, S.: On extremal points of the unit ball in the Banach space of Lipschitz continuous functions. J. Austral. Math. Soc. Ser. A 41(1), 95–98 (1986)
19. Roy, A.K.: Extreme points and linear isometries of the Banach space of Lipschitz functions. Canadian J. Math. 20, 1150–1164 (1968)
20. Ryu, E., Liu, J., Wang, S., Chen, X., Wang, Z., Yin, W.: Plug-and-play methods provably converge with properly trained denoisers. In: Proceedings of the 36th International Conference on Machine Learning, pp. 5546–5557. Proceedings of Machine Learning Research (2019)
21. Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press, Cambridge, MA, USA (2001)
22. Smarzewski, R.: Extreme points of unit balls in Lipschitz function spaces. Proc. Amer. Math. Soc. 125(5), 1391–1397 (1997)
23. Unser, M.: A unifying representer theorem for inverse problems and machine learning. Found. Comput. Math. 21, 1–20 (2020)
24. Unser, M., Fageot, J., Ward, J.P.: Splines are universal solutions of linear inverse problems with generalized TV regularization. SIAM Rev. 59(4), 769–793 (2017)
25. Weaver, N.: Lipschitz Algebras. World Scientific Publishing Co., Inc., River Edge, NJ (1999)
Acknowledgements
The authors would like to thank Rodolfo Assereto and Enis Chenchene for the productive discussions during the secondment period of both J.C.R. and E.N. at the University of Graz. This project has been supported by the TraDE-OPT project which received funding from the European Union’s Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement no. 861137. The Department of Mathematics and Scientific Computing at the University of Graz, with which K.B. is affiliated, is a member of NAWI Graz (https://nawigraz.at/en). This work represents only the view of the authors. The European Commission and other organizations are not responsible for any use that may be made of the information it contains.
Funding
Open access funding provided by Università degli Studi di Genova within the CRUI-CARE Agreement.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Bredies, K., Rodriguez, J.C. & Naldi, E. On extreme points and representer theorems for the Lipschitz unit ball on finite metric spaces. Arch. Math. (2024). https://doi.org/10.1007/s00013-024-01978-y
DOI: https://doi.org/10.1007/s00013-024-01978-y